Preface



This volume contains the proceedings of MPC 2004, the Seventh International
Conference on the Mathematics of Program Construction. This series of conferences
aims to promote the development of mathematical principles and techniques
that are demonstrably useful in the process of constructing computer programs,
whether implemented in hardware or software. The focus is on techniques
that combine precision with conciseness, enabling programs to be constructed
by formal calculation. Within this theme, the scope of the series is very diverse,
including programming methodology, program specification and transformation,
programming paradigms, programming calculi, and programming language semantics.

The quality of the papers submitted to the conference was in general very 
high, and the number of submissions was comparable to that for the previous 
conference. Each paper was refereed by at least four, and often more, committee 
members. 

This volume contains 19 papers selected for presentation by the program
committee from 37 submissions, as well as the abstract of one invited talk:
Extended Static Checking for Java by Greg Nelson, Imaging Systems Department,
HP Labs, Palo Alto, California.

The conference took place in Stirling, Scotland. The previous six conferences 
were held in 1989 in Twente, The Netherlands; in 1992 in Oxford, UK; in 1995 in 
Kloster Irsee, Germany; in 1998 in Marstrand near Göteborg, Sweden; in 2000 in
Ponte de Lima, Portugal; and in 2002 in Dagstuhl, Germany. The proceedings of 
these conferences were published as LNCS 375, 669, 947, 1422, 1837, and 2386, 
respectively. 

Three other international events were co-located with the conference: the
Tenth International Conference on Algebraic Methodology And Software Technology
(AMAST 2004), the Sixth AMAST Workshop on Real-Time Systems
(ARTS 2004), and the Fourth International Workshop on Constructive Methods
for Parallel Programming (CMPP 2004). We thank the organizers of these
events for their interest in sharing the atmosphere of the conference.
Extended Static Checking for Java



Greg Nelson 

Imaging Systems Department 
HP Labs 



Mail Stop 1203 
1501 Page Mill Road 
Palo Alto, CA 94304, USA 

gnelson@hp.com



Abstract. The talk provides an overview and demonstration of an Extended
Static Checker for the Java programming language, a program
checker that finds errors statically but has a much more accurate semantic
model than existing static checkers like type checkers and data flow
analysers. For example, ESC/Java uses an automatic theorem-prover and
reasons about the semantics of assignments and tests in the same way
that a program verifier does. But the checker is fully automatic, and
feels to the programmer more like a type checker than like a program
verifier. A more detailed account of ESC/Java is contained in a recent
PLDI paper [1]. The checker described in the talk and in the PLDI paper
is a research prototype on which work ceased several years ago, but Joe
Kiniry and David Cok have recently produced a more up-to-date checker,
ESC/Java 2 [2].

References 

1. Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James 
B. Saxe, and Raymie Stata. Extended Static Checking for Java. Proc. PLDI'02. 
ACM. Berlin, Germany, 2002. 

2. David Cok and Joe Kiniry. ESC/Java 2 project page, 
http://www.cs.kun.nl/sos/research/escjava/main.html.
Abstract. The efficient representation and manipulation of data is one of the
fundamental tasks in the construction of large software systems. Parametric 
polymorphism has been one of the most successful approaches to date but, as 
of yet, has not been applicable to programming with quotient datatypes such 
as unordered pairs, cyclic lists, bags etc. This paper provides the basis for 
writing polymorphic programs over quotient datatypes by extending our recently 
developed theory of containers. 



1 Introduction 

The efficient representation and manipulation of data is one of the fundamental tasks 
in the construction of large software systems. More precisely, one aims to achieve 
amongst other properties: i) abstraction so as to hide implementation details and thereby 
facilitate modular programming; ii) expressivity so as to uniformly capture as wide a 
class of data types as possible; iii) disciplined recursion principles to provide convenient 
methods for defining generic operations on data structures; and iv) formal semantics to 
underpin reasoning about the correctness of programs. The most successful approach 
to date has been Hindley-Milner polymorphism which provides predefined mechanisms 
for manipulating data structures providing they are parametric in the data. Canonical 
examples of such parametric polymorphic functions are the map and fold operations 
which can be used to define a wide variety of programs in a structured and easy to 
reason about manner. 

However, a number of useful data types and associated operations are not expressible
in the Hindley-Milner type system and this has led to many proposed extensions
including, amongst others, generic programming, dependent types (Altenkirch and 
McBride, 2003), higher order types (Fiore et al., 1999), shapely types (Jay, 1995), 
imaginary types (Fiore and Leinster, 2004) and type classes. However, one area which 
has received less attention is that of quotient types such as, for example, unordered 
pairs, cyclic lists and the bag type. This is because the problem is fundamentally rather 
difficult - on the one hand one wants to allow as wide a theory as possible so as to 
encompass as many quotient types as possible while, on the other hand, one wants to 
restrict one’s definition to derive a well-behaved meta-theory which provides support 
for key programming paradigms such as polymorphic programming etc. Papers such
as Hofmann (1995) have tended to consider quotients of specific types rather than 
quotients of data structures which are independent of the data stored. As a result, this 
paper is original in giving a detailed analysis of how to program with quotient data 
structures in a polymorphic fashion. In particular, 

• We provide a syntax for declaring quotient datatypes which encompasses a variety 
of examples. This syntax is structural which we argue is essential for any theory of 
polymorphism to be applicable. 

• We show how the syntax of such a declaration gives rise to a quotient datatype. 

• We provide a syntax for writing polymorphic programs between these quotient 
datatypes and argue that these programs do indeed deserve to be called 
polymorphic. 

• We show that every polymorphic function between our quotient datatypes is 
represented uniquely by our syntax. That is, our syntax captures all polymorphic 
programs in a unique manner. 

To execute this program of research we extend our work on container datatypes 
(Abbott, 2003; Abbott et al., 2003a, b). Container types represent types via a set of 
shapes and locations in each shape where data may be stored. They are therefore like 
Jay’s shapely types (Jay, 1995) but more general as we discuss later. In previous papers 
cited above, we have shown how these container types are closed under a wide variety 
of useful constructions and can also be used as a framework for generic programming, 
e.g. they support a generic notion of differentiation (Abbott et al., 2003b) which derives
a data structure with a hole from a data structure. 

This paper extends containers to cover quotient datatypes by saying that certain 
labellings of locations with data are equivalent to others. We call these structures 
quotient containers. As such they correspond to the step from normal functors to 
analytic functors in Joyal (1986). However, our quotient containers are more general 
than analytic functors as they allow infinite sets of positions to model coinductive 
datatypes. In addition, our definition of the morphisms between quotient containers 
is new, as are all of their applications to programming. While pursuing the
above program, we use a series of running examples to aid the reader. We assume
only the most basic definitions from category theory like category, functor and natural 
transformations. The exception is the use of left Kan extensions for which we supply 
the reader with the two crucial properties in section 2. Not all category theory books 
contain information on these constructions, so the reader should use Mac Lane (1971); 
Borceux (1994) as references. 

The paper is structured as follows. In section 2 we recall the basic theory of 
containers, container morphisms and their application to polymorphic programming. 
We also discuss the relationship between containers and shapely types. In section 3 
we discuss how quotient datatypes can be represented in container theoretic terms 
while in section 4 we discuss how polymorphic programs between quotient types can 
be represented uniquely as morphisms between quotient containers. We conclude in
section 5 with a summary and proposals for further work.
2 A Brief Summary of Containers



Notation: We write $\mathbb{N}$ for the set of natural numbers and, if $n \in \mathbb{N}$, we write $\underline{n}$ for the
set $\{0, \dots, n-1\}$. We assume the basic definitions of category theory and, if $f : X \to Y$
and $g : Y \to Z$ are morphisms in a category, we write their composite $g \circ f : X \to Z$, as is
standard categorical practice. We write $K_1$ for the constantly 1 valued functor from any
category to Sets. If $A$ is a set and $B$ is an $A$-indexed family of sets, we write $\Sigma a{:}A.\ B(a)$
for the set $\{(a, b) \mid a \in A, b \in B(a)\}$. We write $!$ for the empty map from the empty set
to any other set. Injections into the coproduct are written inl and inr.

This paper uses left Kan extensions to extract a universal property of containers 
which is not immediately visible. We understand that many readers will not be familiar 
with these structures so we supply all definitions and refer the reader to Mac Lane 
(1971) for more details. Their use is limited to a couple of places and hence doesn’t 
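make the paper inaccessible to the non-cognoscenti. Given a functor $I : \mathcal{A} \to \mathcal{B}$ and
a category $\mathcal{C}$, precomposition with $I$ defines a functor $\_ \circ I : [\mathcal{B}, \mathcal{C}] \to [\mathcal{A}, \mathcal{C}]$. The
problem of left Kan extensions is the problem of finding a left adjoint to $\_ \circ I$. More
concretely, given a functor $F : \mathcal{A} \to \mathcal{C}$, the left Kan extension of $F$ along $I$ is written
$\mathrm{Lan}_I F$ and is defined via the natural isomorphism

$[\mathcal{B}, \mathcal{C}](\mathrm{Lan}_I F, H) \cong [\mathcal{A}, \mathcal{C}](F, H \circ I)$ (1)

One can use the following coend formula to calculate the action of a left Kan extension
when $\mathcal{C} = \mathrm{Sets}$ and $\mathcal{A}$ is small:

$(\mathrm{Lan}_I F)X = \int^{A \in \mathcal{A}} \mathcal{B}(IA, X) \times FA$ (2)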



What Are Containers? Containers capture the idea that concrete datatypes consist of 
memory locations where data can be stored. For example, any element of the type of 
lists List(X) of X can be uniquely written as a natural number n given by the length of 
the list, together with a function $\{0, \dots, n-1\} \to X$ which labels each position within
the list with an element from X. Thus we may write

$\mathrm{List}(X) = \Sigma n{:}\mathbb{N}.\ \{0, \dots, n-1\} \to X$ (3)

We may think of the set $\{0, \dots, n-1\}$ as n memory locations while the function $f$
attaches to these memory locations the data to be stored there. Similarly, any binary
tree can be uniquely described by its underlying shape (which is obtained by deleting
the data stored at the leaves) and a function mapping the positions in this shape to the
data thus:




More generally, we are led to consider datatypes which are given by a set of shapes S
and, for each $s \in S$, a set of positions $P(s)$ which we think of as locations in memory
where data can be stored. This is precisely a container.

Definition 2.1 (Container). A container $(S \triangleright P)$ consists of a set S and, for each $s \in S$,
a set of positions $P(s)$.

Of course, in general we do not want to restrict ourselves to the category of sets since 
we want our theory to be applicable to domain theoretic models. Rather, we would 
develop our theory over locally cartesian closed categories (Hofmann, 1994), certain 
forms of fibrations such as comprehension categories (Jacobs, 1999) or models of
Martin-Löf type theory - see our previous work (Abbott, 2003; Abbott et al., 2003a,b)
for such a development. However, part of the motivation for this paper was to make 
containers accessible to the programming community where we believe they provide a 
flexible platform for supporting generic forms of programming. Consequently, we have 
deliberately chosen to work over Sets so as to enable us to get our ideas across without 
an overly mathematical presentation. 

As suggested above, lists can be presented as a container 

Example 2.2 The list type is given by the container with shapes given by the natural
numbers $\mathbb{N}$ and, for $n \in \mathbb{N}$, the positions $P(n)$ defined to be the set $\{0, \dots, n-1\}$.

To summarise, containers are our presentations of datatypes in the same way 
that data declarations are presentations of datatypes in Haskell. The semantics of a 
container is an endofunctor on some category which, in this paper, is Sets. This is given 
by 

Definition 2.3 (Extension of a Container). Let $(S \triangleright P)$ be a container. Its semantics,
or extension, is the functor $T_{S \triangleright P} : \mathrm{Sets} \to \mathrm{Sets}$ defined by

$T_{S \triangleright P}(X) = \Sigma s{:}S.\ (P(s) \to X)$

An element of $T_{S \triangleright P}(X)$ is thus a pair $(s, f)$ where $s \in S$ is a shape and $f : P(s) \to X$ is a
labelling of the positions over $s$ with elements from X. Note that $T_{S \triangleright P}$ really is a functor
since its action on a function $g : X \to Y$ sends the element $(s, f)$ to the element $(s, g \circ f)$.
Thus, for example, the extension of the container for lists is the functor mapping X to

$\Sigma n{:}\mathbb{N}.\ \{0, \dots, n-1\} \to X$

As we commented upon in equation 3, this is the list functor.
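The extension of the list container can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the names `from_list`, `to_list` and `fmap` are hypothetical helpers introduced here, and an element of the extension is modelled as a pair of a shape `n` and a labelling function on positions `0..n-1`.

```python
# A sketch of the extension of the list container (hypothetical helper names).
# An element is a pair (n, f): a shape n and a labelling f of positions 0..n-1.

def from_list(xs):
    """Present an ordinary sequence as a pair (shape, labelling)."""
    items = list(xs)
    return (len(items), lambda i: items[i])

def to_list(elem):
    """Read the labelling back off every position of the shape."""
    n, f = elem
    return [f(i) for i in range(n)]

def fmap(g, elem):
    """Functor action: (s, f) is sent to (s, g . f); the shape is untouched."""
    s, f = elem
    return (s, lambda p: g(f(p)))

doubled = fmap(lambda x: 2 * x, from_list([1, 2, 3]))
print(to_list(doubled))
```

The functor action only post-composes with the labelling, which is why the shape component never changes.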

The theory of containers was developed in a series of recent papers (Abbott, 2003; 
Abbott et al., 2003a,b) which showed that containers encompass a wide variety of 
types as they are closed under various type forming operations such as sums, products, 
constants, fixed exponentiation, (nested) least fixed points and (nested) greatest fixed 
points. Thus containers encapsulate a large number of datatypes. So far, we have dealt 
with containers in one variable whose extensions are functors on Sets. The extension to
n-ary containers, whose extensions are functors $\mathrm{Sets}^n \to \mathrm{Sets}$, is straightforward. Such
containers consist of a set of shapes S and, for each $s \in S$, n position sets $P_1(s), \dots, P_n(s)$.
See the above references for details. 

We finish this section with a more abstract presentation of containers which will be 
used to exhibit the crucial universal property that they satisfy. This universal property 
underlies the key result about containers. First, note that the data in a container $(S \triangleright P)$
can be presented as a functor $P : S \to \mathrm{Sets}$ where here we regard the set S as a discrete
category and P maps each $s$ to $P(s)$. In future, we will switch between these two views
of a container at will. The semantic functor $T_{S \triangleright P}$ has a universal property given by the
following lemma.

Lemma 2.4. Let $P : S \to \mathrm{Sets}$ be a container. Then $T_{S \triangleright P}$ is the left Kan extension of $K_1$
along P:

$T_{S \triangleright P} = \mathrm{Lan}_P K_1$

Proof We calculate as follows

$(\mathrm{Lan}_P K_1)X = \int^{s:S} \mathrm{Sets}(Ps, X) \times K_1 s = \Sigma s{:}S.\ \mathrm{Sets}(Ps, X) = T_{S \triangleright P}(X)$

where the first equality is the classic coend formula for left Kan extensions of equation
2, the second equality holds as S is a discrete category and 1 is the unit for the product,
and the last equality is the definition of $T_{S \triangleright P}(X)$. □

As we shall see, this universal property will be vital to our representation theorem. 



2.1 Container Morphisms 

Containers are designed for implementation. Thus, we imagine defining lists in a 
programming language by writing something like 

data List = $(n : \mathbb{N} \triangleright \underline{n})$

although the type dependency means we need a dependently typed language. If we 
were to make such declarations, how should we program? The usual definitions of lists 
based upon initial algebras or final coalgebras give rise naturally to recursive forms 
of programming. As an alternative, we show that all polymorphic functions between 
containers are captured uniquely by container morphisms. 

Consider first the reverse function applied to a list written in container form $(n, g)$.
Its reversal must be a list and hence of the form $(n', g')$. In addition, $n'$ should only
depend upon n since reverse is polymorphic and hence shouldn't depend upon the actual
data in the list given by g. Thus there is a function $\mathbb{N} \to \mathbb{N}$. In the case of reverse,
the length of the list doesn’t change and hence this is the identity. To define g' which 
associates to each position in the output a piece of data, we should first associate to 
each position in the output a position in the input and then look up the data using g. 



Constructing Polymorphic Programs with Quotient Types 






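Pictorially we have:

[Figure: the positions of the output list $l_1$ are mapped to positions of the input list $l_0$, which are then labelled by $g$.]

Here we start with a list $l_0$ which has 4 positions and a labelling function g into X.
The result of reversing $l_0$ is the list $l_1$ with 4 positions and labelling function given by
the above composite. In general, we therefore define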

Definition 2.5 (Container Morphisms). A morphism $(A \triangleright B) \to (C \triangleright D)$ consists of a
pair $(u, f)$ where $u : A \to C$ and f is an A-indexed family of maps $f_a : D(ua) \to B(a)$. The
category of containers and container morphisms is written Cont.

Example 2.6 The reverse program is given by the container morphism $(\mathrm{Id}, f)$ where
the function $f_n : \{0, \dots, n-1\} \to \{0, \dots, n-1\}$ is defined by $f_n(i) = n - i - 1$.

We stress that this would be the actual definition of reverse in a programming language 
based upon containers. The process of translating other, more abstract and higher level, 
definitions of reverse into the above form indicates the potential use of containers as 
an optimisation tool. Note also how $f_n$ says that the data stored at the i'th cell after
reversing a list is the data stored at the $(n - i - 1)$'th cell in the input list.
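To make the action of a container morphism concrete, the following Python sketch applies a pair `(u, f)` to an element `(a, g)`, producing `(u(a), g . f_a)`. The function names (`apply_morphism`, `reverse_u`, `reverse_f`, `from_list`, `to_list`) are hypothetical and chosen only for this illustration; the encoding of elements as (shape, labelling) pairs is an assumption of the sketch.

```python
# A sketch of applying a container morphism (hypothetical helper names).

def from_list(xs):
    items = list(xs)
    return (len(items), lambda i: items[i])

def to_list(elem):
    n, g = elem
    return [g(i) for i in range(n)]

def apply_morphism(u, f, elem):
    """(u, f) sends (a, g) to (u(a), g . f_a).

    f(a) maps positions of the OUTPUT shape u(a) back to positions of
    the INPUT shape a, so the data is looked up with the old labelling g.
    """
    a, g = elem
    f_a = f(a)
    return (u(a), lambda d: g(f_a(d)))

# The reverse morphism: the shape map is the identity and
# output position i reads from input position n - i - 1.
def reverse_u(n):
    return n

def reverse_f(n):
    return lambda i: n - i - 1

print(to_list(apply_morphism(reverse_u, reverse_f, from_list([10, 20, 30, 40]))))
```

Note that the position map runs contravariantly, from output positions to input positions, which is what allows the result to be computed without inspecting the stored data.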

Consider the tail function tail :: $\mathrm{List}(X) \to 1 + \mathrm{List}(X)$. The shapes of the datatype
$1 + \mathrm{List}(X)$ are $1 + \mathbb{N}$, with the positions above inl(∗) being empty while the positions above
inr(n) are the set $\{0, \dots, n-1\}$. We therefore write this container as $(1 + n : \mathbb{N} \triangleright 0 + \underline{n})$.

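Example 2.7 The tail function $(u, f)$ is given by the container morphism
$(n : \mathbb{N} \triangleright \underline{n}) \to (1 + n : \mathbb{N} \triangleright 0 + \underline{n})$ defined by

$u(0) = \mathrm{inl}(*) \qquad u(n+1) = \mathrm{inr}(n)$

and with $f_0 = {!}$ and $f_{n+1} : \underline{n} \to \underline{n+1}$ defined by $f_{n+1}(i) = i + 1$.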
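The tail morphism can be sketched in Python along the same lines. The encoding of the shapes of 1 + List as tagged pairs `("inl", None)` and `("inr", n)`, and all function names, are assumptions made for this illustration only.

```python
# A sketch of the tail morphism (hypothetical encoding and names).
# Shapes of 1 + List are tagged: ("inl", None) has no positions,
# ("inr", n) has positions 0..n-1.

def tail_u(n):
    """Shape map: u(0) = inl(*), u(n + 1) = inr(n)."""
    return ("inl", None) if n == 0 else ("inr", n - 1)

def tail_f(n):
    """Position map for input shape n: f_0 is the empty map, f_{n+1}(i) = i + 1."""
    if n == 0:
        def empty_map(_):
            raise ValueError("the shape inl(*) has no positions")
        return empty_map
    return lambda i: i + 1

def positions(shape):
    tag, n = shape
    return range(0) if tag == "inl" else range(n)

def tail(elem):
    n, g = elem
    f_n = tail_f(n)
    return (tail_u(n), lambda d: g(f_n(d)))

def read_out(elem):
    """Return the tag of the shape and the data found at its positions."""
    shape, label = elem
    return (shape[0], [label(p) for p in positions(shape)])

print(read_out(tail((3, ["a", "b", "c"].__getitem__))))
```

Because the empty shape has no positions, the empty map is never actually called when the result is read out.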

Thus the i'th cell in the output of a nonempty list comes from the $(i+1)$'th cell in the input
list. Readers may check their understanding at this point by wondering what function
is defined by setting $f_{n+1}(i) = i$ in the above example. We finish this section with two
final points. First, a categorical re-interpretation of a container morphism analogous
to the categorical interpretation of a container $(A \triangleright B)$ as a presheaf $B : A \to \mathrm{Sets}$ as in
lemma 2.4.

Lemma 2.8. A morphism of containers $(u, f) : (A \triangleright B) \to (C \triangleright D)$ is given by a functor
$u : A \to C$ and a natural transformation $f : Du \to B$. Pictorially, this can be represented:

[Diagram: a triangle with $u : A \to C$, $B : A \to \mathrm{Sets}$ and $D : C \to \mathrm{Sets}$, filled by the natural transformation $f : Du \to B$.]

Proof Since A and C are discrete categories, a functor is just a function. Since A is
discrete, the natural transformation is just a family of maps of the given form. □

Finally, containers are more than just a programming tool - they can also simplify 
reasoning. For example, consider proving that reverse o reverse is the identity. Using 
the usual recursive definition one soon runs into problems and must strengthen the 
inductive hypothesis to reflect the interaction of reverse and append. Using the container 
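definition, this problem is trivial as we can reason as follows:
$(\mathrm{Id}, f) \circ (\mathrm{Id}, f) = (\mathrm{Id}, f \circ f) = (\mathrm{Id}, \mathrm{Id})$
since, for each n, the function $f_n$ is clearly an involution, that is, $f_n \circ f_n$ is the identity.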
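The calculation above can be checked mechanically for small shapes. The sketch below composes container morphisms between list containers, where a morphism is modelled as a pair `(u, f)` of Python functions; `compose`, `agree` and the bound `max_shape` are hypothetical names, and the check is a finite test, not a proof.

```python
# A sketch of composing container morphisms (hypothetical names).

def compose(second, first):
    """Composite of first = (u, f) followed by second = (v, g).

    The shape maps compose forwards; the position maps compose backwards:
    an output position is sent by g_{u(a)} into the middle shape and then
    by f_a into the input shape.
    """
    u, f = first
    v, g = second
    return (lambda a: v(u(a)),
            lambda a: (lambda p: f(a)(g(u(a))(p))))

reverse = (lambda n: n, lambda n: (lambda i: n - i - 1))
identity = (lambda n: n, lambda n: (lambda i: i))

def agree(m1, m2, max_shape):
    """Compare two morphisms between list containers on shapes below max_shape."""
    return all(
        m1[0](n) == m2[0](n)
        and all(m1[1](n)(i) == m2[1](n)(i) for i in range(m1[0](n)))
        for n in range(max_shape)
    )

print(agree(compose(reverse, reverse), identity, 8))
```

No induction over the data is needed: only the position maps are compared, which reflects the point made in the text.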

2.2 From Container Morphisms to Polymorphic Functions and Back 

Given a container morphism $(u, f) : (A \triangleright B) \to (S \triangleright P)$, does $(u, f)$ really define a
polymorphic function $T_{A \triangleright B} \to T_{S \triangleright P}$? If so, are all polymorphic functions of this form?
And in a unique manner? To answer these questions we have to describe mathematically 
what a polymorphic function is. In the theory of program language semantics, covariant 
datatypes are usually represented as functors while polymorphic functions between 
covariant datatypes are represented by natural transformations (Bainbridge et al., 1990). 
Other representations of polymorphic functions are as terms in various polymorphic 
lambda calculi and via the theory of logical relations. Various theoretical results show 
that these are equivalent so in the following we take a polymorphic function to be a 
natural transformation. 

Our key theorem is the following which ensures that our syntax for defining 
polymorphic functions as container morphisms is flexible enough to cover all 
polymorphic functions. 

Theorem 2.9. Container morphisms $(A \triangleright B) \to (C \triangleright D)$ are in bijection with natural
transformations $T_{A \triangleright B} \to T_{C \triangleright D}$. Formally, $T : \mathrm{Cont} \to [\mathrm{Sets}, \mathrm{Sets}]$ is full and faithful.

Proof The proof is a special case of Theorem 4.3. Alternatively, see Abbott et al.
(2003a); Abbott (2003). □

Containers vs Shapely Types: In Jay and Cockett (1994) and Jay (1995) shapely types 
(in one parameter) in Sets are pullback preserving functors $F : \mathrm{Sets} \to \mathrm{Sets}$ equipped
with a cartesian natural transformation to the list functor. This means there are natural
maps $FX \to \mathrm{List}(X)$ which extract the data from an element of FX and place it in a
list thereby obtaining a decomposition of data into shapes and positions similar to what 
occurs in containers. 

Note however that the positions in a list have a notion of order and hence a shapely 
type is also equipped with an order on positions. Typically, when we declare a datatype 
we do not want to declare such an ordering over it and, indeed, in the course of 
programming we may wish to traverse a data structure with different orders. At a 
more theoretical level, by reducing datatypes to lists, a classification theorem such as 
Theorem 2.9 would reduce polymorphic functions to polymorphic functions between 
lists but would not be able to classify what these are. Containers do not impose such an 
order and instead reduce datatypes to the more primitive idea of a family of positions 
indexed by shapes. 
3 Containers and Quotient Types

The purpose of this paper is to extend our previous results on containers to cover 
quotient datatypes and polymorphic functions between them. In particular we want to 

• Generalise the notion of container to cover as many quotient datatypes as possible 
and thereby derive a syntax for declaring quotient datatypes. 

• Define a notion of morphism between such generalised containers and thereby 
derive a syntax for polymorphic programming with quotient types. 

• Prove the representation theorem for these polymorphic programs thereby proving 
that the quotient container morphisms really are polymorphic functions and that all 
such polymorphic functions are captured by quotient container morphisms. 

3.1 Unordered Pairs and Cyclic Lists 

We begin with an example of unordered pairs. Note first that the type $X \times X$ is given by
the container $(1 \triangleright 2)$. These are ordered pairs of elements of X. The type of unordered
pairs of X is written as $X \otimes X$ and is defined as $(X \times X)/{\sim}$ where $\sim$ is the equivalence
relation defined by

$(x, y) \sim (y, x)$

Recall our analysis of the pair type was as one shape, containing two positions, $(1 \triangleright 2)$.
Let's call these positions $p_1$ and $p_2$. An element of the pair type then consists of a
labelling for these positions, i.e. a function $f : \{p_1, p_2\} \to X$ for a set X. To move from
ordered pairs to unordered pairs is exactly to note that the labelling $f$ should be regarded
the same as the labelling $f \circ \mathrm{swap}$ where $\mathrm{swap} : \{p_1, p_2\} \to \{p_1, p_2\}$ is the function
sending $p_1$ to $p_2$ and $p_2$ to $p_1$. Thus

Example 3.1 The type of unordered pairs of elements of X is given by

$(\{p_1, p_2\} \to X)/{\sim}$

where $\sim$ is the equivalence relation on $\{p_1, p_2\} \to X$ obtained by setting $f \circ \mathrm{swap} \sim f$.

Let's cement our intuitions by doing another example. A cyclic list is a list with no
starting or ending point. Here is a cyclic list of length 5:

[Figure: a cyclic list of length 5]

Can we represent cyclic lists in the same style as we used for unordered pairs?

Recall from equation 3 that lists were given by $\mathrm{List}(X) = \Sigma n{:}\mathbb{N}.\ \{0, \dots, n-1\} \to X$.
Now, in a cyclic list of length n, a labelling $f : \{0, \dots, n-1\} \to X$ should be equivalent
to the labelling $f \circ \lambda i.\ (i + k) \bmod n$ where $k \in \{0, \dots, n-1\}$. Thus we may define
Example 3.2 The type of cyclic lists of elements of X is given by

$\mathrm{CList}(X) = \Sigma n{:}\mathbb{N}.\ (\{0, \dots, n-1\} \to X)/{\sim}$

where $\sim$ is the equivalence relation on $\{0, \dots, n-1\} \to X$ obtained by setting
$f \sim f \circ \lambda i.\ (i + k) \bmod n$ where $k \in \{0, \dots, n-1\}$.
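As an informal illustration of this equivalence, the Python sketch below decides whether two labellings present the same cyclic list by enumerating all rotations. The names `rotations`, `same_cyclic_list` and `canonical` are hypothetical, and picking the least rotation as a representative is merely one convenient choice that assumes the stored elements can be compared.

```python
# A sketch of cyclic lists as lists modulo rotation (hypothetical names).

def rotations(xs):
    """All labellings f . (lambda i: (i + k) mod n) for k in 0..n-1."""
    n = len(xs)
    return [tuple(xs[(i + k) % n] for i in range(n)) for k in range(n)]

def same_cyclic_list(xs, ys):
    """Two labellings are equivalent iff one is a rotation of the other."""
    if len(xs) != len(ys):
        return False
    return len(xs) == 0 or tuple(ys) in rotations(xs)

def canonical(xs):
    """Choose the least rotation as a representative of the equivalence class."""
    return min(rotations(xs)) if len(xs) > 0 else ()

print(same_cyclic_list([1, 2, 3], [2, 3, 1]))
print(canonical([3, 1, 2]))
```

The rotation depends only on the positions and never on the stored values, which is the structural property required of the equivalence.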

The observant reader will have spotted that, in example 3.2, there is actually an 
equivalence relation for each shape. Examples 3.1 and 3.2 exemplify the kind of 
structures we wish to compute with, i.e. the structures for which we want to find a clean
syntax supporting program construction and reasoning. In general they consist of a
container as defined before with an equivalence relation on labellings of positions. To
ensure the equivalence relation is structural, i.e. independent of data as one would expect
in a polymorphic setting, the equivalence relation is defined by identifying a labelling
$f : P(s) \to X$ with the labelling $f \circ \alpha$ where $\alpha$ is one of a given set of isomorphisms on
$P(s)$. Hence we define

Definition 3.3 (Quotient Containers). A quotient container $(S \triangleright P/G)$ is given by a
container $(S \triangleright P)$ and, for each shape $s \in S$, a set $G(s)$ of isomorphisms of $P(s)$ closed
under composition, inverses and containing the identity.

Thus a quotient container has an underlying container and every container as in
Definition 2.1 is a quotient container with, for each shape s, the set $G(s)$ containing
only the identity isomorphism on $P(s)$. Another way of describing the isomorphisms in
a quotient container $(S \triangleright P/G)$ is to say that for each $s \in S$, $G(s)$ is a subgroup of the
automorphism, or permutation, group on $P(s)$.

Often we present a quotient container $(S \triangleright P/G)$ by defining, for each shape $s \in S$,
the group $G(s)$ to be the smallest group containing a given set. However, the advantage
of requiring $G(s)$ to be a group is that if we define $f \sim_s f'$ iff there is a $g \in G(s)$ such
that $f' = f \circ g$, then $\sim_s$ is automatically an equivalence relation and so we don't have
to consider its closure. A more categorical presentation of quotient containers reflecting
the presentation of containers used in lemma 2.4 is the following.

Lemma 3.4. A quotient container is exactly a functor $P : S \to \mathrm{Sets}$ where every
morphism of S is both an endomorphism and an isomorphism.

Proof Given a quotient container $(S \triangleright P/G)$, we think of S as the category with objects
elements $s \in S$ and, as endomorphisms of s, the set $G(s)$. The functor P is the obvious
functor mapping s to $P(s)$. □

Given a quotient container $(S \triangleright P/G)$, of course we want to calculate the associated
datatype or functor $T_{S \triangleright P/G} : \mathrm{Sets} \to \mathrm{Sets}$. As with the presentation of containers we do
this concretely and then more abstractly to uncover a hidden universal property.

Definition 3.5 (Extension of a Quotient Container). Given a quotient container, say
$(S \triangleright P/G)$, its extension is the functor $T_{S \triangleright P/G} : \mathrm{Sets} \to \mathrm{Sets}$ defined by

$T_{S \triangleright P/G}(X) = \Sigma s{:}S.\ (P(s) \to X)/{\sim_s}$

where $\sim_s$ is the equivalence relation on the set of functions $P(s) \to X$ defined by $f \sim_s f'$
if there is a $g \in G(s)$ such that $f' = f \circ g$.
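For finite position sets, the relation in this definition can be sketched in Python. In this illustration a permutation of positions 0..n-1 is a tuple `p` with `p[i]` the image of `i`, a labelling is a sequence of length n, and the group for a shape is generated from a given set of permutations by closing under composition (which suffices for finite permutation groups). All names are hypothetical.

```python
# A sketch of the equivalence on labellings for one shape (hypothetical names).

def compose_perm(p, q):
    """The permutation p . q, i.e. i is sent to p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate_group(n, generators):
    """Smallest set of permutations of 0..n-1 containing the identity and
    closed under composition with the generators."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = compose_perm(p, g)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

def equivalent(group, f, f2):
    """f ~ f2 iff f2 = f . g for some g in the group."""
    return any(tuple(f[g[i]] for i in range(len(g))) == tuple(f2) for g in group)

swap_group = generate_group(2, [(1, 0)])            # unordered pairs
rotation_group = generate_group(4, [(1, 2, 3, 0)])  # cyclic lists of length 4

print(equivalent(swap_group, ("a", "b"), ("b", "a")))
print(equivalent(rotation_group, "abcd", "cdab"))
```

Taking the trivial group containing only the identity recovers ordinary equality of labellings, matching the observation that every container is a quotient container.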
So a quotient container is like a container except that in the datatype it gives rise to, a
labelling $f$ of positions with data is defined to be the same as the labelling $f \circ g$ obtained
by bijectively renaming the positions using $g \in G(s)$ and then performing the labelling
$f$. The more abstract formulation is given by the next lemma which uses lemma 3.4 to
regard a quotient container $(S \triangleright P/G)$ as a presheaf $P : S \to \mathrm{Sets}$.

Lemma 3.6. Let $P : S \to \mathrm{Sets}$ be a quotient container. Then $T_{S \triangleright P/G}$ is the left Kan
extension of $K_1$ along P:

$T_{S \triangleright P/G} = \mathrm{Lan}_P K_1$

Proof We can calculate the left Kan extension as follows

$(\mathrm{Lan}_P K_1)X = \int^{s:S} \mathrm{Sets}(Ps, X) \times K_1 s$
$= \Sigma s{:}S.\ \mathrm{Sets}(Ps, X)/{\sim_s}$
$= T_{S \triangleright P/G}(X)$

where the first equality is the classic coend formula of equation 2, the second is the
reduction of coends to colimits and the third is the definition of $T_{S \triangleright P/G}$. This is because
the equivalence relation $\sim_s$ in the coend has $f \sim_s f'$ iff there is a $g : s \to s$ such that
$f' = f \circ P(g)$ where $f, f' : P(s) \to X$. This is exactly the definition of the extension of
a quotient container. □



The theory of containers thus generalises naturally to quotient containers as the 
same formula of left Kan extension calculates both the semantics of a container and 
that of a quotient container. 

We finish this section on the presentation of quotient types with the example of 
finite bags (also called multisets) as promised. Bags of other sizes are of course a 
straightforward generalisation. Given this remark, we henceforth refer to finite bags 
as simply bags. The key intuition is that a bag is like a set but elements may have
multiple occurrences - one may thus define a bag as $\mathrm{Bag}(X) = X \to \mathbb{N}$. By putting all
the elements of a bag in a list we get a representation of the bag but of course there are 
many such representations since there is no order of elements in the bag but there is in a 
list. Thus we get a bijection between bags and lists quotiented out by all rearrangements 
of positions. Hence 

Example 3.7 The bag type is the quotient container $(S \triangleright P/G)$ where $(S \triangleright P)$ is the
container for lists and, for $n : \mathbb{N}$, we take $G(n)$ to be the set of all isomorphisms on the
set $\{0, \dots, n-1\}$.
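The two views of bags mentioned above, lists modulo all rearrangements of positions and functions counting occurrences, can be compared with a short Python sketch. The function names are hypothetical; the counting view is modelled with `collections.Counter`, which assumes the elements are hashable, and the quotient view enumerates every permutation and is therefore only practical for very short lists.

```python
# A sketch comparing two presentations of finite bags (hypothetical names).
from collections import Counter
from itertools import permutations

def same_bag_by_quotient(xs, ys):
    """Lists of equal length are equal as bags iff ys = xs . g for some
    permutation g of the positions 0..n-1."""
    xs, ys = tuple(xs), tuple(ys)
    if len(xs) != len(ys):
        return False
    n = len(xs)
    return any(tuple(xs[g[i]] for i in range(n)) == ys
               for g in permutations(range(n)))

def same_bag_by_counting(xs, ys):
    """Bags as multiplicity functions: compare occurrence counts."""
    return Counter(xs) == Counter(ys)

pairs = [("abca", "aabc"), ("abc", "abd"), ("", ""), ("aab", "abb")]
print(all(same_bag_by_quotient(x, y) == same_bag_by_counting(x, y) for x, y in pairs))
```

On these samples the two tests agree, as the bijection between bags and lists quotiented by rearrangements suggests they should.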
4 Programming with Quotient types

We have identified a class of quotient data structures and axiomatised them as 
quotient containers. These quotient containers can be seen as datatype declarations in 
a programming language and so we now ask how we program polymorphically with 
these quotient containers. 

Recall a container morphism $(S \triangleright P) \to (Q \triangleright R)$ consisted of a translation of shapes
$u : S \to Q$ and, for each $s \in S$, a map $f_s : R(us) \to P(s)$ sending positions in the output to
positions in the input. If we now ask what is a map of quotient containers
$(S \triangleright P/G) \to (Q \triangleright R/H)$, it's reasonable to require a map $(u, f) : (S \triangleright P) \to (Q \triangleright R)$ of the underlying
containers which takes into account the respective quotients. There are two issues:

• Since the maps $f_s : R(us) \to P(s)$ are labellings, the quotient given by H says that
a quotient container morphism $(u, f)$ is the same as another quotient container
morphism $(u, f')$ if for each $s \in S$, there is an $h_s \in H(us)$ such that $f_s = f'_s \circ h_s$.

• Given a map $f_s : R(us) \to P(s)$ and a $g \in G(s)$, the labellings $f_s$ and $g \circ f_s$ should
be regarded as equal as labellings of $R(us)$. Hence there should be an $h_g \in H(us)$
such that $f_s \circ h_g = g \circ f_s$.

Hence we define 

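Definition 4.1 (Quotient Container Morphism). A pre-morphism of quotient containers
$(S \triangleright P/G) \to (Q \triangleright R/H)$ is a morphism of the underlying containers
$(u, f) : (S \triangleright P) \to (Q \triangleright R)$ such that for each $s \in S$ and each $g \in G(s)$, there is an
$h_g \in H(us)$ such that the following square commutes, that is, $f_s \circ h_g = g \circ f_s$:

$$
\begin{array}{ccc}
R(us) & \xrightarrow{\;f_s\;} & P(s) \\
{\scriptstyle h_g}\downarrow & & \downarrow{\scriptstyle g} \\
R(us) & \xrightarrow{\;f_s\;} & P(s)
\end{array}
$$

The morphisms $(S \triangleright P/G) \to (Q \triangleright R/H)$ are the premorphisms quotiented by the
equivalence relation

$(u, f) \sim (u, f')$ iff for all $s \in S$, there exists $h_s \in H(us)$ such that $f_s = f'_s \circ h_s$.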

Intuitively, the first condition is precisely the naturality of the quotient container
morphism while the second reflects the fact that labellings are defined up to quotient.
Is this a good definition of a polymorphic program between quotient containers? We 
answer this in two ways. On a theoretical level we show that all polymorphic programs 
can be uniquely captured by such quotient container morphisms while on a practical 
level we demonstrate a number of examples. 

Lemma 4.2. The quotient container morphisms $(S \triangleright P/G) \to (Q \triangleright R/H)$ are in one-to-one
correspondence with natural transformations $K_1 \to T_{Q \triangleright R/H} P$, where in the latter we
regard $P : S \to \mathrm{Sets}$ as a presheaf as described in lemma 2.4.

Proof Such natural transformations are exactly $S$-indexed maps $K_1(s) \to T_{Q \triangleright R/H} P(s)$
which are natural in $S$. Thus we have, for each $s \in S$, maps

$$1 \to \Sigma q : Q.\ (R(q) \to P(s))/{\sim_q}$$

which are natural in $S$. Such a family of $S$-indexed maps is exactly a map $u : S \to Q$ and
an $S$-indexed family of elements of $(R(us) \to P(s))/{\sim_{us}}$ natural in $S$. An element of
$(R(us) \to P(s))/{\sim_{us}}$ is clearly an equivalence class of functions while the naturality
corresponds to the commuting diagram above. □



Theorem 4.3. Quotient container morphisms are in one-to-one correspondence with natural
transformations between their extensions. Hence there is a full and faithful embedding

$$T : \mathrm{QCont} \to [\mathrm{Sets}, \mathrm{Sets}].$$

Proof Natural transformations $T_{S \triangleright P/G} \to T_{Q \triangleright R/H}$ are by lemma 3.6 exactly natural
transformations $\mathrm{Lan}_P K_1 \to \mathrm{Lan}_R K_1$. By the universal property of Kan extensions
given in equation 1, these are in one-to-one correspondence with natural transformations
$K_1 \to T_{Q \triangleright R/H} P$, which by lemma 4.2 are in one-to-one correspondence with quotient
container morphisms $(S \triangleright P/G) \to (Q \triangleright R/H)$. □

Notice the key role of the left Kan extension here. It identifies a universal property 
of the extension of a quotient container which is exactly what is required to prove 
Theorem 4.3. Also note that Theorem 2.9 is a corollary as there is clearly a full and 
faithful embedding of Cont into QCont. 

We now finish with some examples. Firstly, if $(S \triangleright P/G)$ is a quotient container and for
each $s \in S$ the group $G(s)$ is a subgroup of $H(s)$, then there is a morphism of quotient
containers $(S \triangleright P/G) \to (S \triangleright P/H)$. Thus

Example 4.4 The canonical maps $\mathrm{List}(X) \to \mathrm{CList}(X) \to \mathrm{Bag}(X)$ are polymorphic as,
for a given $n \in \mathbb{N}$, they arise as container morphisms from the inclusions of the singleton
group into the group of functions $\{\lambda i.\,(i + k) \bmod n \mid k \in \{0, \ldots, n-1\}\}$ and of this group
into the group of all isomorphisms on $\{0, \ldots, n-1\}$.
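
The chain of subgroups in Example 4.4 can be made tangible with a small Haskell sketch. It is only an illustration under our own assumptions: lists stand in for labellings of size n, a "group" is represented naively as an explicit list of index permutations, and the names `Group`, `trivial`, `rotations`, `symmetric` and `eqUpTo` are hypothetical, not taken from the paper.

```haskell
import Data.List (permutations)

-- A "group" here is, for each size n, a list of permutations of the
-- positions [0 .. n-1], each given as a list of source indices.
type Group = Int -> [[Int]]

trivial, rotations, symmetric :: Group
trivial n   = [[0 .. n - 1]]                 -- lists: only the identity
rotations 0 = [[]]                           -- empty shape: only the identity
rotations n = [[(i + k) `mod` n | i <- [0 .. n - 1]] | k <- [0 .. n - 1]]
symmetric n = permutations [0 .. n - 1]      -- bags: every rearrangement

-- Two labellings are identified when some group element relabels one
-- into the other.
eqUpTo :: Eq a => Group -> [a] -> [a] -> Bool
eqUpTo grp xs ys =
  length xs == length ys && any (\g -> map (xs !!) g == ys) (grp (length xs))
```

Because each group is contained in the next, anything identified by `eqUpTo trivial` is also identified by `eqUpTo rotations`, and anything identified by `eqUpTo rotations` is also identified by `eqUpTo symmetric`, which mirrors the canonical maps from lists to cyclic lists to bags.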

Example 4.5 Every datatype given by a quotient container $(S \triangleright P/G)$ has a map
operation which is given by the action of the functor $T_{S \triangleright P/G}$ on morphisms.

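Example 4.6 We can extend the operation of reversing a list to a reverse on cyclic
lists. One simply needs to check the commutation condition that for each size $n$ and
$k \in \{0, \ldots, n-1\}$, there is a $k'$ such that

$$(\lambda i.\,(i + k) \bmod n) \circ (\lambda i.\,n - i - 1) = (\lambda i.\,n - i - 1) \circ (\lambda i.\,(i + k') \bmod n).$$

The left-hand side sends $i$ to $(n - 1 - i + k) \bmod n$ and the right-hand side sends it to
$(n - 1 - i - k') \bmod n$, so the condition holds exactly when $k' \equiv -k \pmod n$. We
therefore take $k' = (n - k) \bmod n$.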
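
The commutation condition of Example 4.6 is easy to double-check by brute force. The following Haskell sketch (with hypothetical helper names `rot`, `rev`, `commutes` and `witnesses`) searches, for a given size n and shift k, for every shift k' that makes the square commute; for each size we tried it reports the single witness k' = (n - k) mod n.

```haskell
-- Position maps from Example 4.6: rotation by k and reversal, on {0..n-1}.
rot :: Int -> Int -> Int -> Int
rot n k i = (i + k) `mod` n

rev :: Int -> Int -> Int
rev n i = n - i - 1

-- commutes n k k' checks  rot k . rev == rev . rot k'  on all positions.
commutes :: Int -> Int -> Int -> Bool
commutes n k k' =
  and [rot n k (rev n i) == rev n (rot n k' i) | i <- [0 .. n - 1]]

-- All shifts k' in {0..n-1} that make the square commute for a given k.
witnesses :: Int -> Int -> [Int]
witnesses n k = [k' | k' <- [0 .. n - 1], commutes n k k']
```

For instance, `witnesses 5 2` evaluates to `[3]`.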



In general this shows how simple our theory is to apply in practice. We simply have one
condition to check!

5 Conclusions and Further Work

We have provided a synthesis of polymorphism with quotient datatypes in such a way 
as to facilitate programming and reasoning with such quotient structures. This work 
is based upon an extension of our previous work on containers which, going further 
than shapely types, present datatypes via a collection of shapes and, for each shape, a 
collection of positions where data may be stored. The treatment of quotient datatypes 
is structural in that the quotient is determined by a collection of isomorphisms on these 
position sets which induce a quotient on the labellings of positions with data. On top 
of this presentation of datatypes, we have provided a means for programming with 
such structures by defining morphisms of quotient containers. These are essentially 
morphisms of the underlying containers which respect the relevant quotients. This 
simple axiomatisation is proven correct in that we show that these morphisms determine 
exactly the polymorphic programs between these quotient data structures. 

As for further work, we believe containers are an excellent platform for generic
programming and we wish to develop this application of containers. With specific
relationship to these quotient containers, we have begun investigating their application 
to programming in the Theory of Species which was Joyal’s motivation for developing 
analytic functors. From a more practical perspective, we would like to increase the 
examples of programs covered by both containers and quotient containers to include, 
for example, searching and sorting algorithms. Note such programs are not strictly 
polymorphic as their result depends upon inspection of the data. Of course this can 
already be done concretely by accessing the data in a container via the labellings. 
However, a greater challenge is to describe the degree to which such algorithms are
polymorphic. In Haskell, one uses type classes so we would be looking for a
container-theoretic approach to type classes.

Abstract. Generic functions are defined by induction on the structural
representation of types. As a consequence, by defining just a single
generic operation, one acquires this operation over any particular
type. An instance on a specific type is generated by interpretation of the
type's structure. A direct translation leads to extremely inefficient code
that involves many conversions between types and their structural
representations. In this paper we present an optimization technique based
on compile-time symbolic evaluation. We prove that the optimization
removes the overhead of the generated code for a considerable class of
generic functions. The proof uses typing to identify intermediate data
structures that should be eliminated. In essence, the output after
optimization is similar to hand-written code.



1 Introduction 

The role of generic programming in the development of functional programs is
steadily becoming more important. The key point is that a single definition of
a generic function is used to automatically generate instances of that function
for arbitrarily many types. These generic functions are defined by induction on
a structural representation of types. Adding or changing a type does not
require modifications in a generic function; the appropriate code will be
generated automatically. This eradicates the burden of writing similar
instances of one particular function for numerous different data types,
significantly facilitating the task of programming. Typical examples include
generic equality, mapping, pretty-printing, and parsing.

Current implementations of generic programming [AP01, CHJ+02, HP01]
generate code which is strikingly slow because generic functions work with
structural representations rather than directly with data types. The resulting
code requires numerous conversions between representations and data types.
Without optimization, automatically generated generic code runs nearly 10 times
slower than its hand-written counterpart.

In this paper we prove that compile-time (symbolic) evaluation is capable of
reducing the overhead introduced by generic specialization. The proof uses typing
to predict the structure of the result of a symbolic computation. More specifically,
we show that if an expression has a certain type, say $\sigma$, then its symbolic normal
form will contain no other data-constructors than those belonging to $\sigma$.



D. Kozen (Ed.): MPC 2004, LNCS 3125, pp. 16-31, 2004.
© Springer-Verlag Berlin Heidelberg 2004

It appears that general program transformation techniques used in current
implementations of functional languages are not able to remove the generic
overhead. It is even difficult to predict what the result of applying such
transformations on generic functions will be, not to mention a formal proof of
completeness of these techniques.

In the present paper we are looking at generic programming based on the
approach of kind-indexed types of Hinze [Hin00a], used as a basis for the
implementation of generic classes of Glasgow Haskell Compiler (GHC) [HP01],
Generic Haskell [CHJ+02] and Generic Clean [AP01]. The main sources of
inefficiency in the generated code are due to heavy use of higher-order
functions, and conversions between data structures and their structural
representation. For a large class of generic functions, our optimization removes
both of them, resulting in code containing neither parts of the structural
representation (binary sums and products) nor higher-order functions introduced
by the generic specialization algorithm.

The rest of the paper is organized as follows. In section 2 we give motivation
for our work by presenting the code produced by the generic specialization
procedure. The next two sections are preliminary; they introduce a simple
functional language and the typing rules. In section 5, we extend the semantics of
our language to evaluation of open expressions, and establish some properties
of this so-called symbolic evaluation. In section 6 we discuss termination issues
of symbolic evaluation of the generated code. Section 7 discusses related work.
Section 8 reiterates our conclusions.

2 Generics 

In this section we informally present the generated code using as an example
the generic mapping specialized to lists. The structural representation of types
is made up of just the unit type, the binary product type and the binary sum
type [Hin99]:

    data 1 = 1
    data α × β = (α, β)
    data α + β = Inl α | Inr β

For instance, the data types

    data List α = Nil | Cons α (List α)
    data Tree α β = Tip α | Bin β (Tree α β) (Tree α β)

are represented as

    type List° α = 1 + α × (List α)
    type Tree° α β = α + β × Tree α β × Tree α β

Note that the representation of a recursive type is not recursive.

The structural representation of a data type is isomorphic to that data type.
The conversion functions establish the isomorphism:

    toList : List α → List° α
    toList = λl. case l of
        Nil → Inl 1
        Cons x xs → Inr (x, xs)

    fromList : List° α → List α
    fromList = λl. case l of
        Inl u → case u of 1 → Nil
        Inr p → case p of (x, xs) → Cons x xs

The generic specializer automatically generates the type synonyms for structural 
representations and the conversion functions. 

Data types may contain the arrow type. To handle such types the conversion
functions are packed into embedding-projection pairs [HP01]

    data α ⇄ β = EP (α → β) (β → α)

The projections, the inversion and the (infix) composition of
embedding-projections are defined as follows:

    to : (α ⇄ β) → (α → β)
    to = λx. case x of EP t f → t

    from : (α ⇄ β) → (β → α)
    from = λx. case x of EP t f → f

    inv : (α ⇄ β) → (β ⇄ α)
    inv = λx. EP (from x) (to x)

    • : (β ⇄ γ) → (α ⇄ β) → (α ⇄ γ)
    • = λa.λb. EP (to a ∘ to b) (from b ∘ from a)

For instance, the generic specializer generates the following
embedding-projection pair for lists:

    convList : List α ⇄ List° α
    convList = EP toList fromList

To define a generic (polytypic) function the programmer provides the basic
poly-kinded type [Hin00b] and the instances on the base types. For example, the
generic mapping is given by the type

    type Map α β = α → β

and the base cases

    map1 : 1 → 1
    map1 = λx. case x of 1 → 1

    map× : ∀α₁ α₂ β₁ β₂. (α₁ → β₁) → (α₂ → β₂) → (α₁ × α₂ → β₁ × β₂)
    map× = λf.λg.λp. case p of (x, y) → (f x, g y)

    map+ : ∀α₁ α₂ β₁ β₂. (α₁ → β₁) → (α₂ → β₂) → (α₁ + α₂ → β₁ + β₂)
    map+ = λf.λg.λe. case e of
        Inl x → Inl (f x)
        Inr y → Inr (g y)

The generic specializer generates the code for the structural representation T°
of a data type T by interpreting the structure of T°. For instance,

    mapList° : (α → β) → List° α → List° β
    mapList° = λf. map+ map1 (map× f (mapList f))

Note that the structure of mapList° reflects the structure of List°.

The way the arguments and the result of a generic function are converted
from and to the structural representation depends on the base type of the generic
function. Embedding-projections are used to devise the automatic conversion.
Actually, embedding-projections form a predefined generic function that is used
for conversions in all other generic functions (e.g. map) [Hin00a]. The type of
this generic function is α ⇄ β and the base cases are

    ep1 : 1 ⇄ 1
    ep1 = EP map1 map1

    ep+ : (α₁ ⇄ α₂) → (β₁ ⇄ β₂) → (α₁ + β₁ ⇄ α₂ + β₂)
    ep+ = λa.λb. EP (map+ (to a) (to b)) (map+ (from a) (from b))

    ep× : (α₁ ⇄ α₂) → (β₁ ⇄ β₂) → (α₁ × β₁ ⇄ α₂ × β₂)
    ep× = λa.λb. EP (map× (to a) (to b)) (map× (from a) (from b))

    ep→ : (α₁ ⇄ α₂) → (β₁ ⇄ β₂) → ((α₁ → β₁) ⇄ (α₂ → β₂))
    ep→ = λa.λb. EP (λf. to b ∘ f ∘ from a) (λf. from b ∘ f ∘ to a)

    ep⇄ : (α₁ ⇄ α₂) → (β₁ ⇄ β₂) → ((α₁ ⇄ β₁) ⇄ (α₂ ⇄ β₂))
    ep⇄ = λa.λb. EP (λe. b • e • inv a) (λe. inv b • e • a)

The generic specializer generates the instance of ep specific to a generic
function. The generation is performed by interpreting the base (kind-indexed) type
of the function. For mapping (with the base type Map α β) we have:

    epMap : (α₁ ⇄ α₂) → (β₁ ⇄ β₂) → ((α₁ → β₁) ⇄ (α₂ → β₂))
    epMap = λa.λb. ep→ a b

Now there are all the necessary components to generate the code for a generic
function specialized to any data type. In particular, for mapping on lists the
generic specializer generates

    mapList : (α → β) → List α → List β
    mapList = from (epMap convList convList) ∘ mapList°

This function is much more complicated than its hand-coded counterpart

    mapList = λf.λl. case l of
        Nil → Nil
        Cons x xs → Cons (f x) (mapList f xs)
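
To see the overhead concretely, the definitions of this section can be transcribed into Haskell. The following is a minimal sketch, not the output of any actual specializer: the names `Unit`, `Prod`, `Sum`, `ListRep`, `mapListGen` and `mapListHand` are our own, standing for 1, ×, +, List°, the generated mapList and the hand-coded mapList respectively.

```haskell
-- Structural representation types (our names for 1, × and +).
data Unit = Unit deriving (Eq, Show)
data Prod a b = Prod a b deriving (Eq, Show)
data Sum a b = Inl a | Inr b deriving (Eq, Show)

data List a = Nil | Cons a (List a) deriving (Eq, Show)
type ListRep a = Sum Unit (Prod a (List a))   -- List°: not itself recursive

toList :: List a -> ListRep a
toList Nil = Inl Unit
toList (Cons x xs) = Inr (Prod x xs)

fromList :: ListRep a -> List a
fromList (Inl Unit) = Nil
fromList (Inr (Prod x xs)) = Cons x xs

-- Embedding-projection pairs and their projections.
data EP a b = EP (a -> b) (b -> a)

to :: EP a b -> a -> b
to (EP t _) = t

from :: EP a b -> b -> a
from (EP _ f) = f

convList :: EP (List a) (ListRep a)
convList = EP toList fromList

-- Base cases of the generic map.
mapUnit :: Unit -> Unit
mapUnit Unit = Unit

mapProd :: (a1 -> b1) -> (a2 -> b2) -> Prod a1 a2 -> Prod b1 b2
mapProd f g (Prod x y) = Prod (f x) (g y)

mapSum :: (a1 -> b1) -> (a2 -> b2) -> Sum a1 a2 -> Sum b1 b2
mapSum f _ (Inl x) = Inl (f x)
mapSum _ g (Inr y) = Inr (g y)

-- ep for the arrow type; epMap is the instance for the base type Map.
epArrow :: EP a1 a2 -> EP b1 b2 -> EP (a1 -> b1) (a2 -> b2)
epArrow a b = EP (\f -> to b . f . from a) (\f -> from b . f . to a)

epMap :: EP a1 a2 -> EP b1 b2 -> EP (a1 -> b1) (a2 -> b2)
epMap = epArrow

-- "Generated" code: map on the representation, then converted back.
mapListRep :: (a -> b) -> ListRep a -> ListRep b
mapListRep f = mapSum mapUnit (mapProd f (mapListGen f))

mapListGen :: (a -> b) -> List a -> List b
mapListGen = from (epMap convList convList) . mapListRep

-- Hand-coded counterpart.
mapListHand :: (a -> b) -> List a -> List b
mapListHand _ Nil = Nil
mapListHand f (Cons x xs) = Cons (f x) (mapListHand f xs)
```

Both functions compute the same result, but every step of `mapListGen` builds and immediately takes apart an `EP`, a `Sum` and a `Prod` value, which is exactly the overhead the optimization targets.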

The reasons for inefficiency are the intermediate data structures for the
structural representation and extensive usage of higher-order functions. In the
rest of the paper we show that symbolic evaluation guarantees that the
intermediate data structures are not created by the resulting code. The resulting
code is comparable to the hand-written code.

3 Language

In the following section we present the syntax and operational semantics of a 
core functional language. Our language supports essential aspects of functional 
programming such as pattern matching and higher-order functions. 

3.1 Syntax 

Definition 1 (Expressions and Functions) 

a) The set of expressions is defined by the following syntax. In the definition,
$x$ ranges over variables, $C$ over constructors and $F$ over function symbols.
Below the notation $\vec{a}$ stands for $(a_1, \ldots, a_k)$.

$$
\begin{array}{rcl}
E & ::= & x \mid C\,\vec{E} \mid F \mid \lambda x.E \mid E\,E' \mid \mathbf{case}\ E\ \mathbf{of}\ P_1 \to E_1 \cdots P_n \to E_n \\
P & ::= & C\,\vec{x}
\end{array}
$$

b) A function definition is an expression of the form $F = E_F$ with $FV(E_F) = \emptyset$.
With $FV(E)$ we denote the set of free variables occurring in $E$.

The distinction between applications (expressions) and specifications
(functions) is reflected by our language definition. Expressions are composed from
applications of function symbols and constructors. Constructors have a fixed
arity, indicating the number of arguments to which they are applied. Partially
applied constructors can be expressed by $\lambda$-expressions. A function expression is
applied to an argument expression by an (invisible, binary) application operator.
Finally, there is a case-construction to indicate pattern matching. Functions are
simply named expressions (with no free variables).



3.2 Semantics 

We will describe the evaluation of expressions in the style of natural operational 
semantics, e.g. see [NN92]. The underlying idea is to specify the result of a 
computation in a compositional, syntax-driven manner. 

In this section we focus on evaluation to normal form (i.e. expressions being
built up from constructors and $\lambda$-expressions only). In section 5, we extend this
standard evaluation to so-called symbolic evaluation: evaluation of expressions
containing free variables.

Definition 2 (Standard Evaluation)

Let $E, N$ be expressions. Then $E$ is said to evaluate to $N$ (notation $E \Downarrow N$) if
$E \Downarrow N$ can be produced in the following derivation system.

$$
\frac{}{\lambda x.E \Downarrow \lambda x.E}\ (\text{E-}\lambda)
\qquad
\frac{\vec{E} \Downarrow \vec{N}}{C\,\vec{E} \Downarrow C\,\vec{N}}\ (\text{E-cons})
\qquad
\frac{F = E_F \quad E_F \Downarrow N}{F \Downarrow N}\ (\text{E-fun})
$$

$$
\frac{E \Downarrow C_i\,\vec{E} \quad D_i[\vec{x} := \vec{E}] \Downarrow N}{\mathbf{case}\ E\ \mathbf{of}\ \ldots\ C_i\,\vec{x} \to D_i\ \ldots \Downarrow N}\ (\text{E-case})
\qquad
\frac{E \Downarrow \lambda x.E'' \quad E''[x := E'] \Downarrow N}{E\,E' \Downarrow N}\ (\text{E-app})
$$

Here $E[x := E']$ denotes the term that is obtained when $x$ in $E$ is substituted
by $E'$.

Observe that our evaluation does not lead to standard normal forms
(expressions without redexes): if such an expression contains $\lambda$s, there may still be
redexes below these $\lambda$s.
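
The five rules of standard evaluation translate almost line by line into a small interpreter. The Haskell sketch below is our own illustration, not the authors' implementation: the data type `Expr`, the environment type `Defs` and the example definitions in `defs` are hypothetical, and substitution is deliberately naive, which is safe only under the assumption that evaluation starts from a closed expression (so every substituted term is closed).

```haskell
data Expr
  = Var String
  | Con String [Expr]                      -- C applied to its arguments
  | Fun String                             -- function symbol F
  | Lam String Expr
  | App Expr Expr
  | Case Expr [(String, [String], Expr)]   -- alternatives C xs -> body
  deriving (Eq, Show)

type Defs = [(String, Expr)]               -- definitions F = E_F

-- Naive substitution e[x := s]; capture-free as long as s is closed.
subst :: String -> Expr -> Expr -> Expr
subst x s e = case e of
  Var y
    | y == x -> s
    | otherwise -> e
  Con c es -> Con c (map (subst x s) es)
  Fun _ -> e
  Lam y b
    | y == x -> e
    | otherwise -> Lam y (subst x s b)
  App f a -> App (subst x s f) (subst x s a)
  Case d alts ->
    Case (subst x s d)
      [(c, xs, if x `elem` xs then b else subst x s b) | (c, xs, b) <- alts]

eval :: Defs -> Expr -> Expr
eval ds e = case e of
  Lam _ _ -> e                                        -- (E-lambda)
  Con c es -> Con c (map (eval ds) es)                -- (E-cons)
  Fun f -> case lookup f ds of                        -- (E-fun)
    Just body -> eval ds body
    Nothing -> error ("undefined function " ++ f)
  App f a -> case eval ds f of                        -- (E-app)
    Lam x b -> eval ds (subst x a b)
    _ -> error "application of a non-function"
  Case d alts -> case eval ds d of                    -- (E-case)
    Con c args -> case [(xs, b) | (c', xs, b) <- alts, c' == c] of
      ((xs, b) : _) ->
        eval ds (foldr (\(x, a) acc -> subst x a acc) b (zip xs args))
      [] -> error "no matching alternative"
    _ -> error "case on a non-constructor"
  Var x -> error ("free variable " ++ x)

-- Example definitions: boolean negation and the hand-coded mapList.
defs :: Defs
defs =
  [ ("not", Lam "b" (Case (Var "b")
      [("True", [], Con "False" []), ("False", [], Con "True" [])]))
  , ("mapList", Lam "f" (Lam "l" (Case (Var "l")
      [ ("Nil", [], Con "Nil" [])
      , ("Cons", ["x", "xs"],
          Con "Cons" [ App (Var "f") (Var "x")
                     , App (App (Fun "mapList") (Var "f")) (Var "xs") ])
      ])))
  ]
```

Note that `eval` stops at a λ without looking inside it, which is the behaviour pointed out in the remark on normal forms above.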

4 Typing 

Typing systems in functional languages are used to ensure consistency of function 
applications: the type of each function argument should match some specific 
input type. In generic programming types also serve as a basis for specialization. 
Additionally, we will use typing to predict the constructors that appear in the 
result of a symbolic computation. 



Syntax of Types

Types are defined as usual. We use $\forall$-types to express polymorphism.

Definition 3 (Types)

The set of types is given by the following syntax. Below, $\alpha$ ranges over type
variables, and $T$ over type constructors.

$$\sigma, \tau ::= \alpha \mid T \mid \sigma \to \tau \mid \sigma\,\tau \mid \forall\alpha.\sigma$$

We will sometimes use $\vec{\sigma} \to \tau$ as a shorthand for $\sigma_1 \to \cdots \to \sigma_k \to \tau$. The set of free
type variables of $\sigma$ is denoted by $FV(\sigma)$.

The main mechanism for defining new data types in functional languages is 
via algebraic types. 

Definition 4 (Type environments)

a) Let $\mathbb{A}$ be an algebraic type system, i.e. a collection of algebraic type
definitions. The type specifications in $\mathbb{A}$ give the types of the algebraic data
constructors. Let

$$T\,\vec{\alpha} = \cdots \mid C_i\,\vec{\sigma_i} \mid \cdots$$

be the specification of $T$ in $\mathbb{A}$. Then we write

$$\mathbb{A} \vdash C_i : \forall\vec{\alpha}.\ \vec{\sigma_i} \to T\,\vec{\alpha}.$$

b) The function symbols are supplied with a type by a function type
environment $\mathbb{F}$, containing declarations of the form $F : \sigma$.

For the sequel, fix a function type environment $\mathbb{F}$, and an algebraic type
system $\mathbb{A}$.

Type Derivation

Definition 5 (Type Derivation)

a) The type system deals with typing statements of the form

$$B \vdash E : \sigma,$$

where $B$ is a type basis (i.e. a finite set of declarations of the form $x : \tau$).
Such a statement is valid if it can be produced using the following derivation
rules.




b) The function type environment $\mathbb{F}$ is type correct if each function definition is
type correct, i.e. for $F$ with type $\sigma$ and definition $F = E_F$ one has $\emptyset \vdash E_F : \sigma$.

5 Symbolic Evaluation 

The purpose of symbolic evaluation is to reduce expressions at compile-time, for 
instance to simplify the generated mapping function for lists (see section 2). 

If we want to evaluate expressions containing free variables, evaluation cannot 
proceed if the value of such a variable is needed. This happens, for instance, if a 
pattern match on such a free variable takes place. In that case the corresponding 
case-expression cannot be evaluated fully. The most we can do is to evaluate 
all alternatives of such a case-expression. Since none of the pattern variables 
will be bound, the evaluation of these alternatives is likely to get stuck on the 
occurrences of variables again. 

Symbolic evaluation gives rise to a new (extended) notion of normal form,
where in addition to constructors and $\lambda$-expressions, also variables, cases and
higher-order applications can occur. This explains the large number of rules
required to define the semantics.

Definition 6 (Symbolic Evaluation) We adjust definition 2 of evaluation by
replacing the (E-$\lambda$) rule, and by adding rules for dealing with new combinations
of expressions.




Note that the rules (E-case) and (E-app) from definition 2 are responsible
for removing constructor-destructor pairs and applications of the lambda-terms.
These two correspond to the two sources of inefficiency in the generated
programs: intermediate data structures and higher-order functions. The rules
(E-case-case) and (E-app-case) above are called code-motion rules [DMP96]: their
purpose is to move code to facilitate further transformations. For instance, the
(E-case-case) rule pushes the outer case into the alternatives of the inner case
in the hope that an alternative is a constructor. If so, the (E-case) rule is
applicable and the intermediate data are removed. Similarly, (E-app-case) pushes
the application arguments into the case alternatives, hoping that an alternative
is a lambda-term. In this case (E-app) becomes applicable.

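Example 7 (Symbolic Evaluation) Part of the derivation tree for the
evaluation of the expression $\mathit{map}_\times\ f_1\ g_1\ (\mathit{map}_\times\ f_2\ g_2\ p)$ is given below. The function
$\mathit{map}_\times$ is defined in section 2.

$$
\frac{
  \mathit{map}_\times\ f_2\ g_2\ p \Downarrow \mathbf{case}\ p\ \mathbf{of}\ (x',y') \to (f_2\,x', g_2\,y')
  \qquad
  \mathbf{case}\ (f_2\,x', g_2\,y')\ \mathbf{of}\ (x,y) \to (f_1\,x, g_1\,y) \Downarrow (f_1\,(f_2\,x'), g_1\,(g_2\,y'))
}{
  \mathbf{case}\ \mathit{map}_\times\ f_2\ g_2\ p\ \mathbf{of}\ (x,y) \to (f_1\,x, g_1\,y) \Downarrow \mathbf{case}\ p\ \mathbf{of}\ (x',y') \to (f_1\,(f_2\,x'), g_1\,(g_2\,y'))
}
$$

The conclusion of this step is the second premise of the final step:

$$
\frac{
  \mathit{map}_\times \Downarrow \lambda f.\lambda g.\lambda p.\,\mathbf{case}\ p\ \mathbf{of}\ (x,y) \to (f\,x, g\,y)
  \qquad
  \mathbf{case}\ \mathit{map}_\times\ f_2\ g_2\ p\ \mathbf{of}\ (x,y) \to (f_1\,x, g_1\,y) \Downarrow \mathbf{case}\ p\ \mathbf{of}\ (x',y') \to (f_1\,(f_2\,x'), g_1\,(g_2\,y'))
}{
  \mathit{map}_\times\ f_1\ g_1\ (\mathit{map}_\times\ f_2\ g_2\ p) \Downarrow \mathbf{case}\ p\ \mathbf{of}\ (x',y') \to (f_1\,(f_2\,x'), g_1\,(g_2\,y'))
}
$$

The following definition characterizes the results of symbolic evaluation.

Definition 8 (Symbolic Normal Forms) The set of symbolic normal forms
(indicated by $N_s$) is defined by the following syntax.

$$
\begin{array}{rcl}
N_s & ::= & C\,\vec{N_s} \mid \lambda x.N_s \mid N_h \mid \mathbf{case}\ N_h\ \mathbf{of}\ \cdots P_i \to N_s \cdots \\
N_h & ::= & x \mid N_h\,N_s
\end{array}
$$

Proposition 9 (Correctness of Symbolic Normal Form)

$$E \Downarrow N \;\Rightarrow\; N \in N_s$$

Proof: By induction on the derivation of $E \Downarrow N$. □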



5.1 Symbolic Evaluation and Typing 

In this subsection we will show that the type of an expression (or the type of a
function) can be used to determine the constructors that appear (or will appear
after reduction) in the symbolic normal form of that expression. Note that this
is not trivial because an expression in symbolic normal form might still contain
potential redexes that can only be determined and reduced during actual
evaluation. Recall that one of the reasons for introducing symbolic evaluation is
the elimination of auxiliary data structures introduced by the generic
specialization procedure.

The connection between evaluation and typing is usually given by the
so-called subject reduction property indicating that typing is preserved during
reduction.

Proposition 10 (Subject Reduction Property)

$$B \vdash E : \sigma,\ E \Downarrow N \;\Rightarrow\; B \vdash N : \sigma$$

Proof: By induction on the derivation of $E \Downarrow N$. □

There are two ways to determine constructors that can be created during the
evaluation of an expression, namely, (1, directly) by analyzing the expression
itself or (2, indirectly) by examining the type of that expression.

In the remainder of this section we will show that (2) includes all the
constructors of (1), provided that (1) is determined after the expression is
evaluated symbolically. The following definition makes the distinction between the
different ways of indicating constructors precise.



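Definition 11 (Constructors of normal forms and types)

a) Let $N$ be an expression in symbolic normal form. The set of constructors
appearing in $N$ (denoted as $\mathcal{C}_N(N)$) is inductively defined as follows.

$$
\begin{array}{lcl}
\mathcal{C}_N(C\,\vec{N}) & = & \{C\} \cup \mathcal{C}_N(\vec{N}) \\
\mathcal{C}_N(\lambda x.N) & = & \mathcal{C}_N(N) \\
\mathcal{C}_N(x) & = & \emptyset \\
\mathcal{C}_N(N\,N') & = & \mathcal{C}_N(N) \cup \mathcal{C}_N(N') \\
\mathcal{C}_N(\mathbf{case}\ N\ \mathbf{of}\ \cdots P_i \to N_i \cdots) & = & \mathcal{C}_N(N) \cup (\bigcup_i \mathcal{C}_N(N_i))
\end{array}
$$

Here $\mathcal{C}_N(\vec{N})$ should be read as $\bigcup_i \mathcal{C}_N(N_i)$.

b) Let $\sigma$ be a type. The set of constructors in $\sigma$ (denoted as $\mathcal{C}_T(\sigma)$) is inductively
defined as follows.

$$
\begin{array}{lcl}
\mathcal{C}_T(\alpha) & = & \emptyset \\
\mathcal{C}_T(T) & = & \bigcup_i [\{C_i\} \cup \mathcal{C}_T(\vec{\sigma_i})], \text{ where } T = \cdots \mid C_i\,\vec{\sigma_i} \mid \cdots \\
\mathcal{C}_T(\tau \to \sigma) & = & \mathcal{C}_T(\tau) \cup \mathcal{C}_T(\sigma) \\
\mathcal{C}_T(\tau\,\sigma) & = & \mathcal{C}_T(\tau) \cup \mathcal{C}_T(\sigma) \\
\mathcal{C}_T(\forall\alpha.\sigma) & = & \mathcal{C}_T(\sigma)
\end{array}
$$

c) Let $B$ be a basis. By $\mathcal{C}_T(B)$ we denote the set $\bigcup \mathcal{C}_T(\sigma)$ for each $x : \sigma \in B$.

Example 12 For the List type from section 2 and for the Rose tree

    data Rose α = Node α (List (Rose α))

we have $\mathcal{C}_T(\mathrm{List}) = \{\mathrm{Nil}, \mathrm{Cons}\}$ and $\mathcal{C}_T(\mathrm{Rose}) = \{\mathrm{Node}, \mathrm{Nil}, \mathrm{Cons}\}$.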
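
The set of constructors of a type can be computed mechanically, which gives a quick check of Example 12. The Haskell sketch below is an illustration under our own assumptions: the representation `Type`, the environment `AlgEnv` and the function `consOfType` are hypothetical names, and the recursive definition is read as a least fixed point, so a type constructor that is already being expanded contributes nothing new.

```haskell
import Data.List (nub, sort)

data Type
  = TVar String
  | TCon String            -- a type constructor T
  | Type :-> Type
  | TApp Type Type
  | Forall String Type

-- Algebraic type system: type constructor -> [(constructor, argument types)].
type AlgEnv = [(String, [(String, [Type])])]

env :: AlgEnv
env =
  [ ("List", [ ("Nil", [])
             , ("Cons", [TVar "a", TApp (TCon "List") (TVar "a")]) ])
  , ("Rose", [ ("Node", [ TVar "a"
                        , TApp (TCon "List") (TApp (TCon "Rose") (TVar "a")) ]) ])
  ]

-- Constructors of a type, returned as a sorted list without duplicates.
consOfType :: AlgEnv -> Type -> [String]
consOfType e = sort . nub . go []
  where
    go _ (TVar _) = []
    go seen (TCon t)
      | t `elem` seen = []          -- already expanded: nothing new
      | otherwise = case lookup t e of
          Nothing -> []
          Just cs -> concat [c : concatMap (go (t : seen)) args | (c, args) <- cs]
    go seen (a :-> b) = go seen a ++ go seen b
    go seen (TApp a b) = go seen a ++ go seen b
    go seen (Forall _ t) = go seen t
```

With this environment, `consOfType env (TCon "Rose")` yields the three constructors Node, Nil and Cons, in agreement with Example 12.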

As a first step towards a proof of the main result of this section we concentrate 
on expressions that are already in symbolic normal form. Then their typings give 
a safe approximation of the constructors that are possibly generated by those 
expressions. This is stated by the following property. In fact, this result is an 
extension of the Canonical Normal Forms Lemma, e.g. see [Pie02]. 

Proposition 13 Let $N \in N_s$. Then

$$B \vdash N : \sigma \;\Rightarrow\; \mathcal{C}_N(N) \subseteq \mathcal{C}_T(B) \cup \mathcal{C}_T(\sigma).$$

Proof: By induction on the structure of $N_s$. □

The main result of this section shows that symbolic evaluation is adequate
to remove constructors that are not contained in the typing statement of an
expression. For traditional reasons we call this the deforestation property.

Proposition 14 (Deforestation Property)

$$B \vdash E : \sigma,\ E \Downarrow N \;\Rightarrow\; \mathcal{C}_N(N) \subseteq \mathcal{C}_T(B) \cup \mathcal{C}_T(\sigma)$$

Proof: By propositions 9, 13, and 10. □

5.2 Optimising Generics 

Here we show that, by using symbolic evaluation, one can implement a compiler 
that for a generic operation yields code as efficient as a dedicated hand coded 
version of this operation. 

The code generated by the generic specialization procedure is type correct
[Hin00a]. We use this fact to establish the link between the base type of the
generic function and the type of a specialized instance of that generic function.







Artem Alimarine and Sjaak Smetsers 



Proposition 15 Let $g$ be a generic function of type $\sigma$, $T$ a data-type, and let
$g_T$ be the instance of $g$ on $T$. Then $g_T$ is typeable. Moreover, there are no other
type constructors in the type of $g_T$ than $T$ itself or those appearing in $\sigma$.

Proof: See [AS03]. □

Now we combine typing of generic functions with the deforestation property
leading to the following.

Proposition 16 Let $g$ be a generic function of type $\sigma$, $T$ a data-type, and let
$g_T$ be the instance of $g$ on $T$. Suppose $g_T \Downarrow N$. Then for any data type $S$ one has

$$S \notin \sigma, T \;\Rightarrow\; \mathcal{C}_T(S) \cap \mathcal{C}_N(N) = \emptyset.$$

Proof: By propositions 14, 10, and 15. □

Recall from section 2 that the intermediate data introduced by the generic
specializer are built from the structural representation base types $\{\times, +, 1, \rightleftarrows\}$.
It immediately follows from the proposition above that, if neither $\sigma$ nor $T$
contains a structural representation base type $S$, then the constructors of $S$ are not
a part of the evaluated right-hand side of the instance $g_T$.

6 Implementation Aspects: 

Termination of Symbolic Evaluation 

Until now we have avoided the termination problem of the symbolic evaluation.
In general, this termination problem is undecidable, so precautions have to be
taken if we want to use the symbolic evaluator at compile-time. It should be
clear that non-termination can only occur if some of the involved functions are
recursive. In this case such a function might be unfolded infinitely many times
(by applying the rule (E-fun)). The property below follows directly from
proposition 16.

Corollary 17 (Efficiency of generics) Non-recursive generic functions can
be implemented efficiently. More precisely, symbolic evaluation removes
intermediate data structures and functions concerning the structural representation
base types.

The problem arises when we deal with generic instances on recursive data
types. Specialization of a generic function to such a type will lead to a recursive
function. For instance, the specialization of map to List contains a call to mapList°
which, in turn, recursively calls mapList. We can circumvent this problem by
breaking up the definition into a non-recursive part and reintroducing recursion
via the standard fixed point combinator Y = λf. f (Y f). Then we can apply
symbolic evaluation to the non-recursive part to obtain an optimized version of
our generic function. The standard way to remove recursion is to add an extra
parameter to a recursive function, and to replace the call to the function itself
by a call to that parameter.

Example 18 (Non-recursive specialization) The specialization of map to
List without recursion:

    map'List = λm. from (epMap convList convList) ∘
                   (λf. map+ map1 (map× f (m f)))
    mapList = Y map'List

After evaluating map'List symbolically we get

    map'List = λm.λf.λx. case x of
        Nil → Nil
        Cons y ys → Cons (f y) (m f ys)

showing that all intermediate data structures are eliminated.

Suppose the generic instance has type $\tau$. Then the non-recursive variant (with
the extra recursion parameter) will have type $\tau \to \tau$, which obviously has the
same set of type constructors as $\tau$.
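
The idea of tying the knot with a fixed point combinator can be tried out directly in Haskell. This is a small sketch in which `mapListOpen` (our name) plays the role of the symbolically evaluated map'List, with the recursive call replaced by the extra parameter `m`, and `fix` from `Data.Function` plays the role of Y.

```haskell
import Data.Function (fix)

data List a = Nil | Cons a (List a) deriving (Eq, Show)

-- Non-recursive variant: the recursive call goes through the parameter m.
mapListOpen :: ((a -> b) -> List a -> List b) -> (a -> b) -> List a -> List b
mapListOpen m f x = case x of
  Nil -> Nil
  Cons y ys -> Cons (f y) (m f ys)

-- Recursion is reintroduced afterwards with the fixed point combinator.
mapList :: (a -> b) -> List a -> List b
mapList = fix mapListOpen
```

If the instance has type τ, then `mapListOpen` has type τ → τ, as stated above; no structural representation type occurs in either.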

However, this way of handling recursion will not work for generic
functions whose base type contains a recursive data type. Consider for example the
monadic mapping function for the list monad mapl with the base type

    type Mapl α β = α → List β

and the base cases

    mapl1 : 1 → List 1
    mapl1 = return 1

    mapl× : ∀α₁ α₂ β₁ β₂. (α₁ → List β₁) → (α₂ → List β₂) → α₁ × α₂ → List (β₁ × β₂)
    mapl× = λf.λg.λp. case p of (x, y) → f x >>= λx'. g y >>= λy'. return (x', y')

    mapl+ : ∀α₁ α₂ β₁ β₂. (α₁ → List β₁) → (α₂ → List β₂) → α₁ + α₂ → List (β₁ + β₂)
    mapl+ = λf.λg.λe. case e of
        Inl x → f x >>= λx'. return (Inl x')
        Inr y → g y >>= λy'. return (Inr y')

where

    return = λx. Cons x Nil
    (>>=) = λl.λf. flatten (map f l)

are the monadic return and (infix) bind for the list monad. The specialization
of mapl to any data type, e.g. Tree, uses the embedding-projection specialized to
Mapl (see section 2).

    maplTree : (α → List β) → Tree α → List (Tree β)
    maplTree = from (epMapl convTree convTree) ∘ maplTree°

The embedding-projection epMapl

    epMapl : (α₁ ⇄ α₂) → (β₁ ⇄ β₂) → ((α₁ → List β₁) ⇄ (α₂ → List β₂))
    epMapl = λa.λb. ep→ a (epList b)

contains a call to the (recursive) embedding-projection for lists epList

    epList : (α ⇄ β) → (List α ⇄ List β)
    epList = from (ep⇄ convList convList) ∘ epList°

    epList° : (α ⇄ β) → (List° α ⇄ List° β)
    epList° = λf. ep+ ep1 (ep× f (epList f))

We cannot get rid of this recursion (using the Y-combinator) because it is not
possible to replace the call to epList in epMapl by a call to a non-recursive variant
of epList and to reintroduce recursion afterwards.
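
The monadic base cases for products and sums given earlier in this section can be run as they stand with Haskell's built-in list monad, whose `return` and `>>=` behave like the definitions `λx. Cons x Nil` and `flatten (map f l)`. The sketch below uses our own names `maplProd` and `maplSum` for the product and sum cases, built-in pairs for ×, and a locally declared `Sum` type for +.

```haskell
-- Binary sum for the sum base case.
data Sum a b = Inl a | Inr b deriving (Eq, Show)

-- Product case: run f on the first component, g on the second,
-- and pair up every combination of results.
maplProd :: (a1 -> [b1]) -> (a2 -> [b2]) -> (a1, a2) -> [(b1, b2)]
maplProd f g (x, y) = f x >>= \x' -> g y >>= \y' -> return (x', y')

-- Sum case: run the function for the summand that is present
-- and re-tag each result.
maplSum :: (a1 -> [b1]) -> (a2 -> [b2]) -> Sum a1 a2 -> [Sum b1 b2]
maplSum f _ (Inl x) = f x >>= \x' -> return (Inl x')
maplSum _ g (Inr y) = g y >>= \y' -> return (Inr y')
```

For example, `maplProd (\x -> [x, x + 1]) (\y -> [y * 2]) (1, 3)` evaluates to `[(1,6),(2,6)]`. The result type mentions the list type itself, which is why the base type of mapl contains a recursive data type.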

Online Non-termination Detection 

A way to solve the problem of non-termination is to extend symbolic evaluation 
with a mechanism for so-called online non-termination detection. A promising 
method is based on the notion of homeomorphic embedding (HE) [Leu98]: a 
(partial) ordering on expressions used to identify ‘infinitely growing expressions’ 
leading to non-terminating evaluation sequences. Clearly, in order to be safe,
this technique will sometimes unjustly flag expressions as dangerous. We
have done some experiments with a prototype implementation of a symbolic
evaluator extended with termination detection based on HEs. It turned out that
in many cases we get the best possible results. However, guaranteeing success
when transforming arbitrary generics seems to be difficult. The technique
requires careful fine-tuning in order not to cross the border between
termination and non-termination. This is a subject for further research.
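
To make the notion concrete, the following is a minimal sketch of the textbook homeomorphic embedding relation on first-order terms, written in Haskell. It is not the prototype mentioned above: the `Term` type, the function `embeds` and the `whistle` helper are illustrative assumptions, and a real symbolic evaluator would work on its own expression language.

```haskell
-- First-order terms: variables and function symbols applied to arguments.
data Term = Var String | Fun String [Term]
  deriving (Eq, Show)

-- embeds s t holds when s is homeomorphically embedded in t.
embeds :: Term -> Term -> Bool
embeds (Var _) (Var _) = True            -- all variables are treated alike
embeds s (Fun g ts)    = any (embeds s) ts || coupled
  where
    -- diving: s is embedded in some argument of t (the `any` above);
    -- coupling: same head symbol and argument-wise embedding.
    coupled = case s of
      Fun f ss -> f == g && length ss == length ts
                         && and (zipWith embeds ss ts)
      Var _    -> False
embeds (Fun _ _) (Var _) = False

-- The "whistle": flag a new expression if an earlier one is embedded in it.
whistle :: [Term] -> Term -> Bool
whistle history t = any (`embeds` t) history
```

An evaluator would keep the history of expressions it has already unfolded and stop unfolding as soon as `whistle` fires; because the test is conservative, it may also fire for sequences that would in fact terminate.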

In practice, our approach will handle many generic functions as most of them 
do not contain recursive types in their base type specifications, and hence, do not 
require recursive embedding-projections. For instance, all generic functions in the 
generic Clean library (except the monadic mapping) fulfill this requirement. 

7 Related Work 

The generic programming scheme that we use in the present paper is based on 
the approach by Hinze [Hin00a]. Derivable type classes of GHC [HP01], Generic
Haskell [CHJ+02] and Generic Clean [AP01] are based on this specialization 
scheme. We believe symbolic evaluation can also be used to improve the code 
generated by PolyP [JJ97]. The authors of [HP01] show by example that in- 
lining and standard transformation techniques can get rid of the overhead of 
conversions between the types and their representations. The example presented 
does not involve embedding-projections and only treats non-recursive
conversions from a data type to its generic representation. In contrast, our
paper gives a formal treatment of optimization of generics. Moreover, we have
run GHC 6.0.1 with the maximum level of optimization (-O2) on derived instances
of the generic equality function: the resulting code was far from free of the
structural representation overhead.

Initially, we tried to optimize generics by using deforestation [Wad88]
and fusion [Chi94,AGS03]. Deforestation is not very successful because it
demands that functions be in treeless form, and too many generic functions
do not meet this requirement. Even with a more liberal classification of
functions we did not reach an optimal result. We extended the original
fusion algorithm with so-called depth analysis [CK96], but this does not work
because of the producer classification: recursive embedding-projections are
not proper producers. We also experimented with alternative producer
classifications, but without success. Moreover, from a theoretical point of view, the
adequacy of these methods is hard to prove. [Wad88] shows that with deforesta- 
tion a composition of functions can be transformed to a single function without 
loss of efficiency. But the result we are aiming at is much stronger, namely, all 
overhead due to the generic conversion should be eliminated. 

Our approach based on symbolic evaluation resembles the work that has been
done in the field of compiler generation by partial evaluation. E.g., both [ST96]
and [Jp92] start with an interpreter for a functional language and use partial 
evaluation to transform this interpreter into a more or less efficient compiler or 
optimizer. This appears to be a much more general goal. In our case, we are very 
specific about the kind of results we want to achieve. 

Partial evaluation in combination with typing is used in [DMP96,Fil99,AJ01].
They use a two-level grammar to distinguish static terms from dynamic terms. 
Static terms are evaluated at compile time, whereas evaluation of dynamic terms 
is postponed to run time. Simple type systems are used to guide the optimization 
by classifying terms into static and dynamic. In contrast, in the present work
we do not make an explicit distinction between static and dynamic terms. Our
semantics and type system are more elaborate: they support arbitrary algebraic 
data types. The type system is used to reason about the result of the optimization 
rather than to guide the optimization. 

8 Conclusions and Future Work 

The main contributions of the present paper are the following: 

— We have introduced a symbolic evaluation algorithm and proved that the 
result of the symbolic evaluation of an expression will not contain data con- 
structors not belonging to the type of that expression. 

— We have shown that for a large class of generic functions symbolic evaluation 
can be used to remove the overhead of generic specialization. This class 
includes generic functions that do not contain recursive types in their base 
type. 

Problems arise when the generic function types involved contain recursive type
constructors. These type constructors give rise to recursive
embedding-projections, which can lead to non-termination of symbolic
evaluation. We could use fusion to deal with this situation, but then we have
to be satisfied with a method that sometimes produces suboptimal code. It seems
more promising to extend symbolic evaluation with online termination analysis,
most likely based on the homeomorphic embedding [Leu98]. We have already done
some research in this area, but this has not yet led to the desired results.

We plan to study other optimization techniques in application to generic 
programming, such as program transformation in computational form [TM95]. 
Generic specialization has to be adapted to generate code in computational form,
i.e. it has to yield hylomorphisms for recursive types. 

Generics are implemented in Clean 2.0. Currently, the fusion algorithm of 
the Clean compiler is used to optimize the generated instances. As stated above, 
for many generic functions this algorithm does not yield efficient code. For this 
reason we plan to use the described technique extended with termination analysis 
to improve performance of generics.

Abstract. Datatypes which differ inessentially in their names and structure
are said to be isomorphic; for example, a ternary product is isomorphic
to a nested pair of binary products. In some canonical cases, the
conversion function is uniquely determined solely by the two types in- 
volved. In this article we describe and implement a program in Generic 
Haskell which automatically infers this function by normalizing types 
w.r.t. an algebraic theory of canonical isomorphisms. A simple generalization
of this technique also allows us to infer some non-invertible coercions
such as projections, injections and ad hoc coercions between base types.
We explain how this technique has been used to drastically improve the 
usability of a Haskell XML Schema data binding, and suggest how it 
might be applied to improve other type-safe language embeddings. 



1 Introduction 



Typed functional languages like Haskell [27] and ML [16, 25] typically support 
the declaration of user-defined, polymorphic algebraic datatypes. In Haskell, for 
example, we might define a datatype representing dates in a number of ways. 
The most straightforward and conventional definition is probably the one given 
by Date below, 

data Date = Date Int Int Int 

but a more conscientious Dutch programmer might prefer Date_NL: 



data Date_NL = Date_NL Day Month Year
data Day     = Day Int
data Month   = Month Int
data Year    = Year Int .



An American programmer, on the other hand, might opt for Date_US, which 
follows the US date format: 



data Date_US = Date_US Month Day Year .

If the programmer has access to an existing library which can compute with 
dates given as Int-triples, though, he or she may prefer Date2, 

data Date2 = Date2 (Int, Int, Int) , 



D. Kozen (Ed.): MPC 2004, LNCS 3125, pp. 32-53, 2004.
for the sake of simplifying data conversion between his or her application and
the library. In some cases, for example when the datatype declarations are
machine-generated, a programmer might even have to deal with more unusual
declarations such as:

data Date3 = Date3 (Int, (Int, Int))
data Date4 = Date4 ((Int, Int), Int)
data Date5 = Date5 (Int, (Int, (Int, ()))) .

Though these types all represent the same abstract data structure¹, they
represent it differently; they are certainly all unequal, firstly because they have
different names, but more fundamentally because they exhibit different surface 
structures. Consequently, programs which use two or more of these types to- 
gether must be peppered with applications of conversion functions. In this case, 
the amount of code required to define such a conversion function is not so large, 
but if the declarations are machine-generated, or the number of representations 
to be simultaneously supported is large, then the size of the conversion code 
might become unmanageable. In this paper we show how to infer such conver- 
sions automatically. 
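
To give a feel for the boilerplate in question, here is a sketch of the hand-written conversions between just three of the representations above. The function names are ours and purely illustrative; the point is only that each pair of representations needs its own pair of functions.

```haskell
data Date  = Date Int Int Int               deriving (Eq, Show)
data Date3 = Date3 (Int, (Int, Int))        deriving (Eq, Show)
data Date5 = Date5 (Int, (Int, (Int, ())))  deriving (Eq, Show)

-- One conversion function per direction and per pair of types.
dateToDate3 :: Date -> Date3
dateToDate3 (Date a b c) = Date3 (a, (b, c))

date3ToDate :: Date3 -> Date
date3ToDate (Date3 (a, (b, c))) = Date a b c

date3ToDate5 :: Date3 -> Date5
date3ToDate5 (Date3 (a, (b, c))) = Date5 (a, (b, (c, ())))

date5ToDate3 :: Date5 -> Date3
date5ToDate3 (Date5 (a, (b, (c, ())))) = Date3 (a, (b, c))
```

With n representations there are up to n(n-1) such functions to write and keep in sync, which is the growth the inference mechanism is meant to avoid.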

1.1 Isomorphisms 

The fact that all these types represent the same abstract type is captured by 
the notion of isomorphism: two types are isomorphic if there exists an invertible 
function between them, our desired conversion function. Besides invertibility, 
two basic facts about isomorphisms (isos for short) are: the identity function is 
an iso, so every type is isomorphic to itself; and, the composition of two isos is 
an iso. Considered as a relation, then, isomorphism is an equivalence on types. 

Other familiar isos are a consequence of the semantics of base types. For
example, the conversion between meters and miles is a non-identity iso between
the floating point type Double and itself; the conversion between Cartesian and
polar coordinates (if we preserve the origin) is another example. Finally, some
polymorphic isos arise from the structure of types themselves; for example, one
often hears that products are associative “up to isomorphism”.

It is the last sort, often called canonical or coherence (iso)morphisms, which
are of chief interest to us. Canonical isos are special because they are uniquely 
determined by the types involved, that is, there is at most one canonical iso 
between two polymorphic type schemes. 

Monoidal Isos. A few canonical isos of Haskell are summarized by the syntactic 
theory below².

¹ We will assume all datatypes are strict; otherwise, Haskell’s non-strict semantics
typically entails that some transformations like nesting add a new value ⊥ which
renders this claim false.

² We use the type syntax familiar from the Generic Haskell literature, i.e., Unit and
:*: are respectively nullary and binary product, and Zero and :+: are respectively
nullary and binary sum constructors.

a :*: Unit ≅ a    Unit :*: a ≅ a    (a :*: b) :*: c ≅ a :*: (b :*: c)

a :+: Zero ≅ a    Zero :+: a ≅ a    (a :+: b) :+: c ≅ a :+: (b :+: c)

The isomorphisms which witness these identities are the evident ones. The first
two identities in each row express the fact that Unit (resp. Zero) is a right and
left unit for :*: (resp. :+:); the last says that :*: (resp. :+:) is associative.
We call these isos collectively the monoidal isos.
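
For readers who prefer code, some of these witnesses can be written out in plain Haskell. This is only a sketch: it assumes `()`, pairs, `Void` (from `Data.Void`) and `Either` as stand-ins for Unit, :*:, Zero and :+:, and the function names are ours.

```haskell
import Data.Void (Void, absurd)

-- Right unit and associativity for products.
unitR :: (a, ()) -> a
unitR (x, ()) = x

unitRInv :: a -> (a, ())
unitRInv x = (x, ())

assocP :: ((a, b), c) -> (a, (b, c))
assocP ((x, y), z) = (x, (y, z))

assocPInv :: (a, (b, c)) -> ((a, b), c)
assocPInv (x, (y, z)) = ((x, y), z)

-- Right unit and associativity for sums.
zeroR :: Either a Void -> a
zeroR = either id absurd

assocS :: Either (Either a b) c -> Either a (Either b c)
assocS (Left (Left x))  = Left x
assocS (Left (Right y)) = Right (Left y)
assocS (Right z)        = Right (Right z)
```

The left-unit laws and the inverse of `assocS` follow the same pattern and are omitted.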

This list is not exhaustive. For example, binary product and sum are also 
canonically commutative: 

a :*: b ≅ b :*: a    a :+: b ≅ b :+: a

and the currying and the distributivity isos are also canonical: 

(a :*: b) → c ≅ a → (b → c)    a :*: (b :+: c) ≅ (a :*: b) :+: (a :*: c)

There is a subtle but important difference between the monoidal isos and the 
other isos mentioned above. Although all are canonical, and so possess unique 
polymorphic witnesses determined by the type schemes involved, only in the case 
of the monoidal isos does the uniqueness property transfer unconditionally to 
the setting of types. 

To see this, consider instantiating the product-commutativity iso scheme to 
obtain: 

Int :*: Int ≅ Int :*: Int .

This identity has two witnesses: one is the intended twist map, but the other is 
the identity function. 
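
The ambiguity is easy to reproduce in plain Haskell. In the sketch below (names ours, pairs standing in for :*:), both definitions are invertible and both have the instantiated type, so the types alone cannot tell them apart.

```haskell
-- The polymorphic witness of product commutativity.
twist :: (a, b) -> (b, a)
twist (x, y) = (y, x)

-- Two distinct isos at the instance (Int, Int) -> (Int, Int).
witness1, witness2 :: (Int, Int) -> (Int, Int)
witness1 = twist
witness2 = id
```

At the polymorphic type `(a, b) -> (b, a)` only `twist` is available; it is the instantiation that introduces the second candidate.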

This distinction is in part attributable to the form of the identities involved; 
the monoidal isos are all strongly regular, that is: 

1. each variable that occurs on the left-hand side of an identity occurs exactly 

once on the right-hand side, and vice versa, and 

2. they occur in the same order on both sides. 

The strong regularity condition is adapted from work on generalized
multicategories [15,14,10]. We claim, but have not yet proved, that strong
regularity is a sufficient, but not necessary, condition to ensure that a pair
of types determines a unique canonical iso witness.
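
The two conditions amount to a simple syntactic check on the variable occurrences of both sides. The following sketch spells it out for a small, hypothetical type-expression language (`Ty`, `vars` and `stronglyRegular` are our own names and not part of the implementation described later).

```haskell
import Data.List (nub)

infixr 6 :*:
infixr 5 :+:

-- A tiny language of type schemes: variables, units, products and sums.
data Ty = TVar String | TUnit | TZero | Ty :*: Ty | Ty :+: Ty

-- Variable occurrences, left to right, with repetitions.
vars :: Ty -> [String]
vars (TVar v)  = [v]
vars TUnit     = []
vars TZero     = []
vars (l :*: r) = vars l ++ vars r
vars (l :+: r) = vars l ++ vars r

-- An identity l = r is strongly regular when both sides mention the same
-- variables, each exactly once, in the same order.
stronglyRegular :: Ty -> Ty -> Bool
stronglyRegular l r = vl == vr && nub vl == vl
  where
    vl = vars l
    vr = vars r
```

Under this check associativity and the unit laws pass, whereas commutativity fails the ordering condition and distributivity fails the exactly-once condition.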

Thanks to the canonicality and strong regularity properties, given two types 
we can determine if a unique iso between them exists, and if so can generate 
it automatically. Thus our program infers all the monoidal isos, but not the 
commutativity or distributivity isos; we have not yet attempted to treat the 
currying iso. 

Datatype Isos. In Generic Haskell each datatype declaration effectively induces 
a canonical iso between the datatype and its underlying “structure type”. For 
example, the declaration 



data List a = Nil | Cons a (List a)

induces the canonical iso

List a ≅ Unit :+: (a :*: List a) .

We call such isos datatype isos. 

Note that datatype isos are not strongly regular in general; for example the 
List identity mentions a twice on the right-hand side. Intuitively, though, there 
is only one witness to a datatype iso: the constructor(s). Again, we claim, and
hope in the future to prove, that isos of this sort uniquely determine a canonical 
witness. Largely as a side effect of the way Generic Haskell works, our inference 
mechanism does infer datatype isos. 
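
In plain Haskell the List iso can be written by hand as follows. This is a sketch assuming `Either`, `()` and pairs as stand-ins for :+:, Unit and :*:; the names `ListS`, `fromList` and `toList` are ours, whereas Generic Haskell derives the corresponding conversions automatically.

```haskell
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- The "structure type" of List: one layer of the recursion unfolded.
type ListS a = Either () (a, List a)

fromList :: List a -> ListS a
fromList Nil         = Left ()
fromList (Cons x xs) = Right (x, xs)

toList :: ListS a -> List a
toList (Left ())       = Nil
toList (Right (x, xs)) = Cons x xs
```

Each direction simply adds or removes one constructor, which is why the constructors themselves can be regarded as the witness.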

1.2 Outline 

The remainder of this article is organized as follows. In section 2 we give an 
informal description of the user interface to our inference mechanism. Section 3 
discusses a significant application of iso inference, a way of automatically cus- 
tomizing a Haskell-XML Schema data binding. In section 4 we examine the 
Generic Haskell implementation of our iso inferencer. Finally, in section 5 we 
summarize our results, and discuss related work and possibilities for future work 
in this area. 

2 Inferring Isomorphisms 

From a Generic Haskell user’s point of view, iso inference is a simple matter 
based on two generic functions, 

reduce{|t|} :: t → Univ

expand{|t'|} :: Univ → t' .

reduce{|t|} takes a value of any type and converts it into a universal, normalized
representation denoted by the type Univ; expand{|t'|}, its dual, converts such a
universal value back to a ‘regular’ value, if possible. The iso which converts from
t to t' is thus expressed as:

expand{|t'|} ∘ reduce{|t|} .

If t = t', then expand{|t'|} and reduce{|t|} are mutual inverses. If t and t' are
merely isomorphic, then expansion may fail; it always succeeds if the two types
are canonically isomorphic, t ≅ t', according to the monoidal and datatype iso
theories.
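
The reduce/expand idea can be imitated in ordinary Haskell for the product fragment. The sketch below is not the Generic Haskell implementation: the universal type here is a hypothetical flat list of atoms, the class `Flat` and all function names are our own, and failure is reported with `Maybe` rather than a run-time error.

```haskell
-- Hypothetical universal representation: a flat sequence of base values.
data Atom = AInt Int | ABool Bool
  deriving (Eq, Show)

type Univ = [Atom]

class Flat a where
  reduce       :: a -> Univ
  -- Rebuild a value from a prefix of the universal value; return the rest.
  expandPrefix :: Univ -> Maybe (a, Univ)

instance Flat Int where
  reduce n = [AInt n]
  expandPrefix (AInt n : rest) = Just (n, rest)
  expandPrefix _               = Nothing

instance Flat Bool where
  reduce b = [ABool b]
  expandPrefix (ABool b : rest) = Just (b, rest)
  expandPrefix _                = Nothing

-- The unit contributes nothing, which realizes the unit laws.
instance Flat () where
  reduce ()      = []
  expandPrefix u = Just ((), u)

-- Concatenation forgets nesting, which realizes associativity.
instance (Flat a, Flat b) => Flat (a, b) where
  reduce (x, y) = reduce x ++ reduce y
  expandPrefix u = do
    (x, u')  <- expandPrefix u
    (y, u'') <- expandPrefix u'
    return ((x, y), u'')

-- An iso must consume the whole universal value.
expand :: Flat a => Univ -> Maybe a
expand u = case expandPrefix u of
  Just (x, []) -> Just x
  _            -> Nothing

convert :: (Flat a, Flat b) => a -> Maybe b
convert = expand . reduce
```

Here `convert` succeeds between any two nestings of the same sequence of base types and fails when the order of components differs, mirroring the absence of commutativity in the theory.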

As an example, consider the expression 

(expand{|(Bool, Bool :+: (Int :+: String))|} ∘
 reduce{|(Bool, ((), (Bool :+: Int) :+: String))|})

(True, ((), Inl (Inr 7))) ,

which evaluates to 



(True, Inr (Inl 7)) .

Function reduce{|t|} picks a type in each isomorphism class which serves as a
normal form, and uses the canonical witness to convert values of t to that form.
Normalized values are represented in a special way in the abstract type Univ; a
typical user need not understand the internals of Univ unless expand{|t'|} fails.
If t and t' are ‘essentially’ the same, yet structurally substantially different,
then this automatic conversion can save the user a substantial amount of typing,
time and effort.

Our functions also infer two coercions which are not invertible: 

a :*: b ≤ a    a ≤ a :+: b

The canonical witnesses here are the first projection of a product and the left 
injection of a sum. Thanks to these reductions, the expression 

(expand{|Either Bool Int|} ∘ reduce{|(Bool, Int)|}) (True, 4)

evaluates to Left True ; note that it cannot evaluate to Right 4 because such a 
reduction would involve projecting a suffix and injecting into the right whereas 
we infer only prefix projections and left injections. Of course, we would prefer our 
theory to include the dual pair of coercions as well, but doing so would break 
the property that each pair of types determines a unique canonical witness. 
Nevertheless, we will see in section 3.4 how these coercions, when used with a 
cleverly laid out datatype, can be used to simulate single inheritance. 
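
Written by hand, the coercion inferred in the example above is just the composition of the two canonical witnesses. The following sketch uses pairs and `Either` for :*: and :+:; the names are ours.

```haskell
-- Canonical witness of  a :*: b <= a : the first projection.
firstProj :: (a, b) -> a
firstProj = fst

-- Canonical witness of  a <= a :+: b : the left injection.
leftInj :: a -> Either a b
leftInj = Left

-- The coercion from (Bool, Int) to Either Bool Int composes the two.
boolIntToEither :: (Bool, Int) -> Either Bool Int
boolIntToEither = leftInj . firstProj
```

The dual candidate `Right . snd` has the same type, which illustrates why admitting both pairs of coercions at once would destroy uniqueness.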

Now let us look at some examples which fail. 

1. The conversion

expand{|(Bool, Int)|} ∘ reduce{|(Int, Bool)|}

fails because our theory does not include commutativity of :*:.

2. The conversion

expand{|Bool|} ∘ reduce{|Int|}

fails because the types are neither isomorphic nor coercible.

3. The conversion

expand{|Bool|} ∘ reduce{|Either () ()|}

fails because we chose to represent certain base types like Bool as abstract:
they are not destructured when reducing.

Currently, because our implementation depends on the “universal” type Univ, 
failure occurs at run-time and a message helpful for pinpointing the error’s source 
is printed. In section 5, we discuss some possible future work which may provide 
static error detection. 

3 Improving a Haskell-XML Schema Data Binding

A program that processes XML documents can be implemented using an XML 
data binding. An XML data binding [23] translates an XML document to a value
of some programming language. Such bindings have been defined for a number
of programming languages including Java [21,24], Python [26], Prolog [7] and
Haskell [35, 37, 1]. The default translation scheme of a data binding may produce
unwieldy, convoluted and redundant types and values. Our own Haskell-XML
Schema binding, called UUXML [1], suffers from this problem.

In this section we use UUXML as a case study, to show how iso inference can 
be used to address a practical problem, the problem of overwhelmingly complex 
data representation which tends to accompany type-safe language embeddings. 
We outline the problem, explain how the design criteria gave rise to it, and finally 
show how to attack it. 

In essence, our strategy will be to define a customized datatype, one chosen 
by the client programmer especially for the application. We use our mechanism to 
automatically infer the functions which convert to and from the customized
representation by bracketing the core of the program with reduce{|t|} and expand{|t|}.
Generic Haskell does the rest, and the programmer is largely relieved from the 
burden imposed by the UUXML data representation. 

The same technique might be used in other situations, for example, compilers 
and similar language processors which are designed to exploit type-safe data 
representations.



3.1 The Problem with UUXML 

We do not have the space here to describe UUXML in detail, but let us briefly 
give the reader a sense of the magnitude of the problem. 

Consider the following XML schema, which describes a simple bibliographic 
record doc including a sequence of authors, a title and an optional publication 
date, which is a year followed by a month.

<element name="doc" type="docType"/>
<complexType name="docType">
  <sequence>
    <element ref="author" minOccurs="0" maxOccurs="unbounded"/>
    <element ref="title"/>
    <element ref="pubDate" minOccurs="0"/>
  </sequence>
  <attribute name="key" type="string"/>
</complexType>
<element name="author" type="string"/>
<element name="title" type="string"/>
<complexType name="pubDateType">
  <sequence>
    <element ref="year"/>
    <element ref="month"/>
  </sequence>
</complexType>
<element name="pubDate" type="pubDateType"/>
<element name="year" type="int"/>
<element name="month" type="int"/>

An example document which validates against this schema is:

<doc key="homer-iliad">
  <author>Homer</author>
  <title>The Iliad</title>
</doc>

Our binding tool translates each of the types doc and docType into a pair of 
types (explained in the next section), 



data E_doc u        = E_doc (Elem LE_E_doc LE_T_docType u)
data LE_E_doc u     = EQ_E_doc (E_doc u)
data T_docType u    = T_docType (Seq A_key (Seq (Rep LE_E_author ZI)
                                 (Seq LE_E_title (Rep LE_E_pubDate
                                 (ZS ZZ)))) u)
data LE_T_docType u = EQ_T_docType (T_docType u)
                    | LE_T_publicationType (LE_T_publicationType u)



and the example document above into: 



EQ_E_doc (E_doc (Elem () (EQ_T_docType (T_docType (Seq (A_key (Attr
(EQ_T_string (T_string "homer-iliad")))) (Seq (Rep (ZI [EQ_E_author
(E_author (Elem () (EQ_T_string (T_string "Homer"))))])) (Seq (EQ_E_title
(E_title (Elem () (EQ_T_string (T_string "The Iliad"))))) (Rep
(ZS Nothing (Rep ZZ))))))))))

Even without knowing the details of the encoding or definitions of the
unfamiliar datatypes, one can see the problem here; if a user wants to, say,
retrieve the content of the key attribute, he or she must pattern-match against
no fewer than ten constructors before reaching "homer-iliad". For larger, more
complex documents or document types, the problem can be even worse.



3.2 Conflicting Issues in UUXML 

UUXML’s usability issues are a side effect of its design goals. We discuss these 
here in some depth, and close by suggesting why similar issues may plague other 
applications which process typed languages. 

First, UUXML is type-safe and preserves as much static type information 
as possible to eliminate the possibility of constructing invalid documents. In 
contrast, Java-XML bindings tend to ignore a great deal of type information, 
such as the types of repeated elements (only partly because of the limitations of 
Java collections). 

Second, UUXML translates (a sublanguage of) XML Schema types rather 
than the less expressive DTDs. This entails additional complexity compared 
with bindings such as HaXML [37] that merely target DTDs. For example, XML 
Schema supports not just one but two distinct notions of subtyping and a more 
general treatment of mixed content³ than DTDs.

³ “Mixed content” refers to character data interspersed with elements. For example,
in XHTML a p element can contain both character data and other elements like em.

Third, the UUXML translation closely follows the Model Schema Language
(MSL) formal semantics [4], even going so far as to replicate that formalism’s 
abstract syntax as closely as Haskell’s type syntax allows. This has advantages: 
we have been able to prove the soundness of the translation, that is, that valid 
documents translate to typeable values, and the translator is relatively easy 
to correctly implement and maintain. However, our strict adherence to MSL 
has introduced a number of ‘dummy constructors’ and ‘wrappers’ which could 
otherwise be eliminated. 

Fourth, since Haskell does not directly support subtyping and XML Schema 
does, our binding tool emits a pair of Haskell datatypes for each schema type 
t: an ‘equational’ variant which represents documents which validate exactly 
against t, and a ‘down-closed’ variant, which represents all documents which 
validate against all subtypes of t⁴. Our expectation was that a typical Haskell
user would read a document into the down-closed variant, pattern-match against 
it to determine which exact/equational type was used, and do the bulk of their 
computation using that. 

Finally, UUXML was intended, first and foremost, to support the develop- 
ment of ‘schema-aware’ XML applications using Generic Haskell. This moniker 
describes programs, such as our XML compressor XComprez [1], which oper- 
ate on documents of any schema, but not necessarily parametrically. XComprez, 
for example, exploits the type information of a schema to improve compression 
ratios. 

Because Generic Haskell works by traversing the structure of datatypes, we 
could not employ methods, such as those in WASH [35], which encode schema 
information in non-structural channels such as Haskell’s type class system. Such 
information is instead necessarily expressed in the structure of UUXML’s types, 
and makes them more complex. 

For schema-aware applications this complexity is not such an issue, since 
generic functions typically need not pattern-match deeply into a datatype. But 
if we aim to use UUXML for more conventional applications, as we have demon- 
strated, it can become an overwhelming problem. 

In closing, we emphasize that many similar issues are likely to arise, not only 
with other data bindings and machine-generated programs, but also with any 
type-safe representation of a typed object language in a metalanguage such as 
Haskell. Preserving the type information necessarily complicates the representa- 
tion. If the overall ‘style’ of the object language is to be preserved, as was our 
desire in staying close to MSL, then the representation is further complicated. 
If subtyping is involved, yet more. If the representation is intended to support 
generic programming, then one is obliged to express as much information as 
possible structurally, and this too entails some complexity. 

For reasons such as these, one might be tempted to eschew type-safe embed- 
dings entirely, but then what is the point of programming in a statically typed 



⁴ To help illustrate this in the example schema translation, we posited that docType
had a hypothetical subtype publicationType. It appears as the body of the second
constructor of LE_T_docType in section 3.1.

language if not to exploit the type system? Arguably, the complexity problem
arises not from static typing itself, but rather from the insistence on using
only a single data representation. In the next section, we show how iso
inference drastically simplifies dealing with multiple data representations.

3.3 Exploiting Isomorphisms 

Datatypes produced by UUXML are unquestionably complicated. Let us con- 
sider instead what our ideal translation target might look like. Here is an obvious, 
very conventional, Haskell-style translation image of doc: 

module Doc where

data Doc = Doc{ key     :: String,
                authors :: [String],
                title   :: String,
                pubDate :: Maybe PubDate}

data PubDate = PubDate{ year  :: Integer,
                        month :: Integer}



Observe in particular that: 

— the target types Doc and PubDate have conventional, Haskellish names which 
do not look machine-generated; 

— the fields are typed by conventional Haskell datatypes like String, lists and 
Maybe; 

— the attribute key is treated just like other elements;

— intermediate ‘wrapper’ elements like title and year have been elided and do
not generate new types;

— the positional information encoded in wrappers is available in the field
projection names; and

— the field name authors has been changed from the element name author, 
which is natural since authors projects a list whereas each author tag wraps 
a single author. 

Achieving an analogous result in Java with a data binding like JAXB would 
require annotating (editing) the source schema directly, or writing a ‘binding 
customization file’ which is substantially longer than the two datatype
declarations above. Both methods also require learning another XML vocabulary
and some details of the translation process, and the latter uses XPath syntax
to denote the parts which require customization, a maintenance hazard since the
schema structure may change.

With our iso inference system, provided that the document is known to be 
exactly of type doc and not a proper subtype, all that is required is the above 
Haskell declaration plus the following modest incantation: 



expand{|Doc|} ∘ reduce{|E_doc|}

This expression denotes a function of type E_doc → Doc which converts the
unwieldy UUXML representation of doc into the idealized form above.

For example, the following is a complete Generic Haskell program that reads 
in a doc-conforming document from standard input, deletes all authors named 
“De Sade”, and writes the result to standard output. 



module Censor where

import UUXML   -- our framework
import XDoc    -- automatically translated XML Schema
import Doc     -- the two datatype declarations above

main     = interact work
work     = toE_doc ∘ censor ∘ toDoc
censor d = d{authors = filter (/= "De Sade") (authors d)}
toE_doc  = unparse{|E_doc|} ∘ expand{|E_doc|} ∘ reduce{|Doc|}
toDoc    = expand{|Doc|} ∘ reduce{|E_doc|} ∘ parse{|E_doc|}
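
The application logic itself involves nothing generic. As a plain-Haskell sketch, the `censor` step can be exercised in isolation on the idealized types; the declarations are repeated here only to make the fragment self-contained, and the `deriving` clauses are our addition.

```haskell
data PubDate = PubDate { year :: Integer, month :: Integer }
  deriving (Eq, Show)

data Doc = Doc { key     :: String
               , authors :: [String]
               , title   :: String
               , pubDate :: Maybe PubDate }
  deriving (Eq, Show)

-- Remove every author named "De Sade"; all other fields are untouched.
censor :: Doc -> Doc
censor d = d { authors = filter (/= "De Sade") (authors d) }
```

All knowledge of the UUXML encoding is confined to the two conversion functions, so `censor` never mentions a machine-generated constructor.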



3.4 The Role of Coercions 

Recall that our system infers two non-invertible coercions: 

a :*: b ≤ a    a ≤ a :+: b

Of course, this is only half of the story we would like to hear! Though we could 
easily implement the dual pair of coercions, we cannot implement them both 
together except in an ad hoc fashion (and hence refrain from doing so). This 
is only partly because, in reducing to a universal type, we have thrown away 
the type information. Even if we knew the types involved, it is not clear, for 
example, whether the coercion a → a :+: a should determine the left or the right
injection. 

Fortunately, even this ‘biased’ form of subtyping proves quite useful. In par- 
ticular, XML Schema’s so-called ‘extension’ subtyping exactly matches the form 
of the first projection coercion, as it only allows documents validating against 
a type t to be used in contexts of type s if s matches a prefix of t: so t is an 
extension of s. 

Schema’s other form of subtyping, called ‘restriction’, allows documents val- 
idating against type t to be used in contexts of type s if every document val- 
idating against t also validates against s: so t is a restriction of s. This can 
only happen if s, regarded as a grammar, can be reformulated as a disjunction 
of productions, one of which is t, so it appears our left injection coercion can 
capture part of this subtyping relation as well. 

Actually, due to a combination of circumstances, the situation is better than 
might be expected. First, subtyping in Schema is manifest or nominal, rather
than purely structural: consequently, restriction only holds between types as- 
signed a name in the schema. Second, our translation models subtyping by gener- 
ating a Haskell datatype declaration for the down-closure of each named schema 
type. For example, the ‘colored point’ example familiar from the object-oriented 
literature would be expressed thus:

data Point     = Point ...
data CPoint    = CPoint ...
data LE_Point  = EQ_Point Point
               | LE_CPoint LE_CPoint
data LE_CPoint = EQ_CPoint CPoint



Third, we have arranged our translator so that the EQ_. . . constructors always 
appear in the leftmost summand. This means that the injection from the ‘equa- 
tional’ variant of a translated type to its down-closed variant is always the left- 
most injection, and consequently picked out by our expansion mechanism. 



EQ_Point  :: Point → LE_Point

EQ_CPoint :: CPoint → LE_CPoint

Since Haskell is, in itself, not so well equipped to deal with subtyping, when reading
an XML document we would rather have the coercion the other way around, that 
is, we should like to read an LE_Point into a Point, but of course this is unsafe. 
However, when writing a value to a document these coercions save us some work 
inserting constructors. 

Of course, since, unlike Schema itself, our coercion mechanism is structural, 
we can employ this capability in other ways. For instance, when writing a value 
to a document, we can use the fact that Nothing is the leftmost injection into 
the Maybe a type to omit optional elements. 
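
The down-closure scheme can be tried out in plain Haskell. The sketch below is only illustrative: the field layouts of Point and CPoint are hypothetical (the declarations above elide them), and the helper names upPoint, upCPoint and downPoint are not part of UUXML. It shows that the coercion into the down-closed type is total, whereas the reverse direction has to be partial.

```haskell
-- Hypothetical field layouts; the original declarations elide them.
data Point  = Point  Int Int        deriving (Eq, Show)
data CPoint = CPoint Int Int String deriving (Eq, Show)

-- Down-closures: the EQ_ constructor is always the leftmost summand.
data LE_Point  = EQ_Point Point | LE_CPoint LE_CPoint deriving (Eq, Show)
data LE_CPoint = EQ_CPoint CPoint                     deriving (Eq, Show)

-- Safe direction (used when writing): the leftmost injection.
upPoint :: Point -> LE_Point
upPoint = EQ_Point

-- A colored point is also usable where a point is expected.
upCPoint :: CPoint -> LE_Point
upCPoint = LE_CPoint . EQ_CPoint

-- Unsafe direction (wanted when reading): only some values are plain points.
downPoint :: LE_Point -> Maybe Point
downPoint (EQ_Point p)  = Just p
downPoint (LE_CPoint _) = Nothing
```

For instance, downPoint (upPoint p) returns Just p for every p, while downPoint (upCPoint c) is always Nothing.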



3.5 Conclusion 

Let us summarize the main points of this case study. 

We demonstrated first by example that UUXML-translated datatypes are 
overwhelmingly complex and redundant. To address complaints that this prob- 
lem stems merely from a bad choice of representation, we enumerated some of 
UUXML’s design criteria, and explained why they necessitate that representa- 
tion. We also suggested why other translations and type-safe embeddings might 
succumb to the same problem. Finally, we described how to exploit our iso in- 
ference mechanism to address this problem, and how coercion inference can also 
be used to simplify the treatment of object language features such as subtyping 
and optional values which the metalanguage does not inherently support. 



4 Generic Isomorphisms 

In this section, we describe how to automatically generate isomorphisms between 
pairs of datatypes. Our implementation platform is Generic Haskell, and in par- 
ticular we use dependency-style GH [17]. This section assumes a basic familiarity 
with Generic Haskell, but the definitions are all remarkably simple. 

We address the problem in four parts, treating first the product and sum 
isos in isolation, then showing how to merge those implementations. Finally, we 
describe a simple modification of the resulting program which implements the
non-invertible coercions.

In each case, we build the requisite morphism by reducing a value v :: t to a
value of a universal data type u = reduce{|t|} v :: Univ. The type Univ plays the
role of a normal form from which we can then expand to a value expand{|t'|} u :: t'
of the desired type, where t ≤ t' canonically, or t ≅ t' for the isos.



4.1 Handling Products 

We define the functions reduce{|t|} and expand{|t|} which infer the isomorphisms
expressing associativity and identities of binary products:

a :*: Unit ≅ a     Unit :*: a ≅ a     (a :*: b) :*: c ≅ a :*: (b :*: c)

We assume a set of base types, which may include integers, booleans, strings 
and so on. For brevity’s sake, we mention only integers in our code. 

data UBase = UInt Int | UBool Bool | UString String | ...

The following two functions merely serve to convert back and forth between the 
larger world and our little universe of base types. 



type ReduceBase{[*]} t      = t -> UBase
reducebase{|t :: κ|}        :: ReduceBase{[κ]} t
reducebase{|Int|} i         = UInt i

type ExpandBase{[*]} t      = UBase -> t
expandbase{|t :: κ|}        :: ExpandBase{[κ]} t
expandbase{|Int|} (UInt i)  = i



Now, as Schemers well know, if we ignore the types and remove all occurrences 
of Unit, a right-associated tuple is simply a cons-list; hence our representation,
Univ, is defined:



type Univ = [UBase] . 

Our implementation of reduce{|t|} depends on an auxiliary function red{|t|}, which
accepts a value of t along with an accumulating argument of type Univ; it returns
the normal form of the t-value with respect to the laws above. The role of
reduce{|t|} is just to prime red{|t|} with an empty list.

type Red{[*]} t           = t -> Univ -> Univ
red{|t :: κ|}             :: Red{[κ]} t
red{|Int|} i u            = reducebase{|Int|} i : u
red{|Unit|} ()            = id
red{|a :*: b|} (a :*: b)  = red{|a|} a ∘ red{|b|} b

reduce{|t :: *|}          :: t -> Univ
reduce{|t|} x             = red{|t|} x []

Here is an example of reduce{|t|} in action:

reduce{|((Int, (Int, Int)), ())|} ((2, (3, 4)), ()) = [UInt 2, UInt 3, UInt 4] .

Function expand{|t|} takes a value of the universal data type, and returns a value
of type t. It depends on the generic function len{|t|}, which computes the length
of a product, that is, the number of components of a tuple:



type Len{[*]} t   = Int
len{|t :: κ|}     :: Len{[κ]} t
len{|Int|}        = 1
len{|Unit|}       = 0
len{|a :*: b|}    = len{|a|} + len{|b|} .



Observe that Unit is assigned length zero. 

Now we can write expand{|t|}; like reduce{|t|}, it is defined in terms of a
helper function exp{|t|}, this time in a dual fashion with the ‘unparsed’ remainder
appearing as output. 



type Exp{[*]} t        = Univ -> (t, Univ)
exp{|t :: κ|}          :: Exp{[κ]} t
exp{|Int|} (u : us)    = (expandbase{|Int|} u, us)
exp{|Int|} []          = error "exp"
exp{|Unit|} us         = (Unit, us)
exp{|a :*: b|} us      = let (u, us')  = exp{|a|} us
                             (v, us'') = exp{|b|} us'
                         in (u :*: v, us'')

type Expand{[*]} t     = Univ -> t
expand{|t :: κ|}       :: Expand{[κ]} t
expand{|t|} u          = case exp{|t|} u of
                           (v, []) -> v
                           (v, _)  -> error "expand"



In the last case, we compute the lengths of each factor of the product to 
determine how many values to project there - remember that a need not be a 
base type. This information tells us how to split the list between recursive calls. 
Here is an example of expand{|t|} in action:



expand{|((Int, (Int, Int)), ())|} [UInt 2, UInt 3, UInt 4] = ((2, (3, 4)), ())
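
The product machinery can be approximated in ordinary Haskell by replacing the type-indexed functions with a type class. The sketch below makes several assumptions not found in the text: the class name Prod, the method name ex (renamed from exp to avoid clashing with Prelude.exp), and user-defined Unit and :*: types standing in for Generic Haskell's structure types. Only the Int base type is covered.

```haskell
{-# LANGUAGE TypeOperators #-}

data Unit    = Unit    deriving (Eq, Show)
data a :*: b = a :*: b deriving (Eq, Show)
infixr 6 :*:

data UBase = UInt Int deriving (Eq, Show)
type Univ  = [UBase]

-- One instance per type case of the generic functions red and exp.
class Prod t where
  red :: t -> Univ -> Univ      -- accumulate the components of a value
  ex  :: Univ -> (t, Univ)      -- parse a value, returning the remainder

instance Prod Int where
  red i u          = UInt i : u
  ex (UInt i : us) = (i, us)
  ex []            = error "exp"

instance Prod Unit where
  red Unit = id                 -- Unit contributes nothing to the list
  ex us    = (Unit, us)

instance (Prod a, Prod b) => Prod (a :*: b) where
  red (x :*: y) = red x . red y
  ex us = case ex us of
    (u, us') -> case ex us' of
      (v, us'') -> (u :*: v, us'')

reduce :: Prod t => t -> Univ
reduce x = red x []

-- The iso case: the whole list must be consumed.
expand :: Prod t => Univ -> t
expand u = case ex u of
  (v, []) -> v
  (_, _)  -> error "expand"
```

Composing expand after reduce at two different product types then re-associates a tuple and drops or inserts Unit components as required by the target type.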



4.2 Handling Sums 

We now turn to the treatment of associativity and identity laws for sums: 

a :+: Zero ≅ a     Zero :+: a ≅ a     (a :+: b) :+: c ≅ a :+: (b :+: c) .

We can implement Zero as an abstract type with no (visible) constructors:

data Zero . 

As we will be handling sums alone in this section, we redefine the universal type 
as a right-associated sum of values: 

data Univ = UInl UBase | UInr Univ .

Note that this datatype Univ is isomorphic to:

data Univ = UIn Int UBase .

We prefer the latter as it simplifies some definitions. We also add a second integer
field:

data Univ = UIn Int Int UBase .

If u = UIn r a b then we shall call a the arity of u - it remembers the “width”
of the sum value we reduced; we call r the rank of u - it denotes a zero-indexed
position within the arity, the choice which was made. We guarantee, then, that
0 ≤ r < a. Of course, unlike Unit, Zero has no observable values so there is no
representation for it in Univ.

UBase, reducebase{|t|} and expandbase{|t|} are defined as before.

This time around, function reduce{|t|} represents values by ignoring choices
against Zero and right-associating sums. The examples below show some example
inputs and how they are reduced (we write I for Int and u for UInt i):

i :: I                            ↦  UIn 0 1 u
Inl i :: I :+: Zero               ↦  UIn 0 1 u
Inr i :: Zero :+: I               ↦  UIn 0 1 u
Inl i :: I :+: I                  ↦  UIn 0 2 u
Inr i :: I :+: I                  ↦  UIn 1 2 u
Inl (Inl i) :: (I :+: I) :+: I    ↦  UIn 0 3 u
Inl (Inr i) :: (I :+: I) :+: I    ↦  UIn 1 3 u
Inr i :: (I :+: I) :+: I          ↦  UIn 2 3 u

Function reduce{|t|} depends on the generic value arity{|t|}, which counts the
number of choices in a sum.

type Arity{[*]} t   = Int
arity{|t :: κ|}     :: Arity{[κ]} t
arity{|Int|}        = 1
arity{|Zero|}       = 0
arity{|a :+: b|}    = arity{|a|} + arity{|b|}

Now we can define reduce{|t|}:



type Reduce{[*]} t          = t -> Univ
reduce{|t :: κ|}            :: Reduce{[κ]} t
reduce{|Int|} i             = UIn 0 1 (reducebase{|Int|} i)
reduce{|Zero|} _            = ⊥
reduce{|a :+: b|} (Inl x)   = UIn r (a + arity{|b|}) u
  where UIn r a u           = reduce{|a|} x
reduce{|a :+: b|} (Inr x)   = UIn (r + arity{|a|}) (arity{|a|} + a) u
  where UIn r a u           = reduce{|b|} x .



This treats base types as unary sums, and computes the rank of a value by 
examining the arities of each summand, effectively ‘flattening’ the sum. 

The function expand{|t|} is defined as follows:



type Expand{[*]} t            = Univ -> t
expand{|t :: κ|}              :: Expand{[κ]} t
expand{|Int|} (UIn 0 1 u)     = expandbase{|Int|} u
expand{|Zero|} _              = error "expand"
expand{|a :+: b|} (UIn r a u)
  | a == aa + ab && r < aa    = Inl (expand{|a|} (UIn r (a - ab) u))
  | a == aa + ab              = Inr (expand{|b|} (UIn (r - aa) (a - aa) u))
  | otherwise                 = error "expand"
  where (aa, ab)              = (arity{|a|}, arity{|b|}) .



The logic in the last case checks that the universal value ‘fits’ in the sum type 
a :+: b, and injects it into the appropriate summand by comparing the value’s 
rank with the arity of a, being sure to adjust the rank and arity on recursive 
calls. 
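
The sum case can be approximated in plain Haskell in the same way. This is a sketch under assumptions: a type class Sum replaces the type-indexed functions, a Proxy argument stands in for the type index of arity, the arity component is named n to keep it apart from the type variable a, and only Int is supported as a base type.

```haskell
{-# LANGUAGE TypeOperators, ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))

data Zero                                   -- no constructors, hence no values
data a :+: b = Inl a | Inr b deriving (Eq, Show)
infixr 5 :+:

data UBase = UInt Int deriving (Eq, Show)
data Univ  = UIn Int Int UBase deriving (Eq, Show)   -- rank, arity, payload

class Sum t where
  arity  :: Proxy t -> Int
  reduce :: t -> Univ
  expand :: Univ -> t

instance Sum Int where                      -- a base type is a unary sum
  arity _  = 1
  reduce i = UIn 0 1 (UInt i)
  expand (UIn 0 1 (UInt i)) = i
  expand _                  = error "expand"

instance Sum Zero where
  arity _  = 0
  reduce _ = error "reduce: Zero has no values"
  expand _ = error "expand"

instance (Sum a, Sum b) => Sum (a :+: b) where
  arity _ = arity (Proxy :: Proxy a) + arity (Proxy :: Proxy b)
  reduce (Inl x) = UIn r (n + arity (Proxy :: Proxy b)) u
    where UIn r n u = reduce x
  reduce (Inr x) = UIn (r + aa) (aa + n) u
    where UIn r n u = reduce x
          aa        = arity (Proxy :: Proxy a)
  expand (UIn r n u)
    | n == aa + ab && r < aa = Inl (expand (UIn r (n - ab) u))
    | n == aa + ab           = Inr (expand (UIn (r - aa) (n - aa) u))
    | otherwise              = error "expand"
    where aa = arity (Proxy :: Proxy a)
          ab = arity (Proxy :: Proxy b)
```

With these definitions, reducing Inl (Inr 5) at type (Int :+: Int) :+: Int and expanding the result at type Int :+: (Int :+: Int) yields Inr (Inl 5): the rank is preserved while the nesting changes.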



4.3 Sums and Products Together 

It may seem that a difficulty in handling sums and products simultaneously 
arises in designing the type Univ, as a naive amalgamation of the sum Univ (call 
it UnivS) and the product Univ (call it UnivP) permits multiple representations of 
values identified by the canonical isomorphism relation. However, since the rules 
of our isomorphism theory do not interact (in particular, we do not account
for any sort of distributivity), a simpler solution exists: we can nest our two
representations and add the top layer as a new base type. For example, we can
use UnivP in place of UBase in UnivS and add a new constructor to UBase to 
encapsulate sums. 

data UnivS = UIn Integer UnivP
data UnivP = UNil | UCons UBase UnivP
data UBase = UInt Int | USum UnivS

We omit the details, as the changes to our code examples are straightforward. 


4.4 Handling Coercions

The reader may already have noticed that our expansion functions impose some 
unnecessary limitations. In particular: 

— when we expand to a product, we require that the length of our universal 
value equals the number computed by len{|t|}, and

— when we expand to a sum, we require that the arity of our universal value 
equals the number computed by arity{|t|}.

If we lift these restrictions, replacing equality by inequality, we can project a 
prefix of a universal value onto a tuple of smaller length, and inject a universal 
value into a choice of larger arity. The modified definitions are shown below for 
products: 



expand{|t|} u = case exp{|t|} u of
                  (v, _) -> v

and for sums: 



expand{|a :+: b|} (UIn r a u)
  | a <= aa + ab && r < aa = Inl (expand{|a|} (UIn r (a - ab) u))
  | a <= aa + ab           = Inr (expand{|b|} (UIn (r - aa) (a - aa) u))
  | otherwise              = error "expand"
  where (aa, ab)           = (arity{|a|}, arity{|b|}) .



These changes implement our canonical coercions, the first projection of a prod- 
uct and left injection of a sum: 



a :*: b ≤ a          a ≤ a :+: b
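
Both coercions can be exercised in a plain Haskell sketch. Everything here beyond the two laws is an assumption: type classes Prod and Sum replace the type-indexed functions, the names coerceP, coerceS and place are hypothetical, Zero is omitted for brevity, and, as a deliberate variation on the guarded definition for sums, the arity is compared once at the top level and the value is then placed by rank alone rather than by threading an adjusted arity through the recursion.

```haskell
{-# LANGUAGE TypeOperators, ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))

data Unit    = Unit          deriving (Eq, Show)
data a :*: b = a :*: b       deriving (Eq, Show)
data a :+: b = Inl a | Inr b deriving (Eq, Show)
infixr 6 :*:
infixr 5 :+:

data UBase = UInt Int deriving (Eq, Show)

-- Products: reduce to a list, then expand while tolerating leftovers.
class Prod t where
  red :: t -> [UBase] -> [UBase]
  ex  :: [UBase] -> (t, [UBase])

instance Prod Int where
  red i u          = UInt i : u
  ex (UInt i : us) = (i, us)
  ex []            = error "exp"

instance Prod Unit where
  red Unit = id
  ex us    = (Unit, us)

instance (Prod a, Prod b) => Prod (a :*: b) where
  red (x :*: y) = red x . red y
  ex us = case ex us of
    (u, us') -> case ex us' of
      (v, us'') -> (u :*: v, us'')

-- Relaxed expansion: an unparsed suffix is discarded (a :*: b <= a).
coerceP :: (Prod s, Prod t) => s -> t
coerceP x = fst (ex (red x []))

-- Sums: reduce to (rank, arity, payload), then place the payload by rank.
class Sum t where
  arity  :: Proxy t -> Int
  reduce :: t -> (Int, Int, UBase)
  place  :: Int -> UBase -> t

instance Sum Int where
  arity _  = 1
  reduce i = (0, 1, UInt i)
  place 0 (UInt i) = i
  place _ _        = error "expand"

instance (Sum a, Sum b) => Sum (a :+: b) where
  arity _ = arity (Proxy :: Proxy a) + arity (Proxy :: Proxy b)
  reduce (Inl x) = let (r, n, u) = reduce x
                   in  (r, n + arity (Proxy :: Proxy b), u)
  reduce (Inr x) = let (r, n, u) = reduce x
                       aa        = arity (Proxy :: Proxy a)
                   in  (r + aa, aa + n, u)
  place r u
    | r < aa    = Inl (place r u)
    | otherwise = Inr (place (r - aa) u)
    where aa = arity (Proxy :: Proxy a)

-- Relaxed expansion: the source may be narrower than the target (a <= a :+: b).
coerceS :: forall s t. (Sum s, Sum t) => s -> t
coerceS x
  | n <= arity (Proxy :: Proxy t) = place r u
  | otherwise                     = error "expand"
  where (r, n, u) = reduce x
```

In this sketch coerceP projects ((1 :*: 2) :*: 3) onto its prefix 1 :*: 2, and coerceS injects 5 :: Int into Int :+: Int as Inl 5.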



Ad Hoc Coercions. Schema (and most other conventional languages) also de- 
fines a subtyping relation between primitive types. For example, int is a subtype 
of integer which is a subtype of decimal. We can easily model this by (adding 
some more base types and) modifying the functions which convert base types. 



expandbase{|Decimal|} (UDecimal x)  = x
expandbase{|Decimal|} (UInteger x)  = integer2dec x
expandbase{|Decimal|} (UInt x)      = int2dec x
expandbase{|Integer|} (UInteger x)  = x
expandbase{|Integer|} (UInt x)      = int2integer x
expandbase{|Int|} (UInt x)          = x



Such primitive coercions are easy to handle, but without due care are likely to 
break the coherence properties of inference, so that the inferred coercion depends 
on operational details of the inference algorithm. 
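
A plain Haskell rendering of these base-type coercions might look as follows. It is a sketch under assumptions: a type class Base replaces the type-indexed functions, Decimal is a hypothetical newtype over Rational standing in for Schema's decimal, and the conversions int2integer, integer2dec and int2dec are realised with toInteger, fromInteger and fromIntegral.

```haskell
-- Hypothetical stand-in for Schema's decimal type.
newtype Decimal = Decimal Rational deriving (Eq, Show)

data UBase = UInt Int | UInteger Integer | UDecimal Decimal
  deriving (Eq, Show)

class Base t where
  reducebase :: t -> UBase
  expandbase :: UBase -> t

instance Base Int where
  reducebase = UInt
  expandbase (UInt x) = x
  expandbase _        = error "expandbase: not an int"

instance Base Integer where
  reducebase = UInteger
  expandbase (UInteger x) = x
  expandbase (UInt x)     = toInteger x            -- int <= integer
  expandbase _            = error "expandbase: not an integer"

instance Base Decimal where
  reducebase = UDecimal
  expandbase (UDecimal x) = x
  expandbase (UInteger x) = Decimal (fromInteger x)   -- integer <= decimal
  expandbase (UInt x)     = Decimal (fromIntegral x)  -- int <= decimal
```

Reading back a value stored as UInt 7 at type Decimal then gives Decimal 7, while the opposite direction fails at run time.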


5 Conclusions

In this paper, we have described a simple, powerful and general mechanism 
for automatically inferring a well-behaved class of isomorphisms, and demon- 
strated how it addresses some usability problems stemming from the complexity 
of our Haskell-XML Schema data binding, UUXML. Our mechanism leverages 
the power of an existing tool, Generic Haskell, and the established and growing 
theory of type isomorphisms. 

We believe that both the general idea of exploiting isomorphisms and our 
implementation technique have application beyond UUXML. For example, when 
libraries written by distinct developers are used in the same application, they 
often include different representations of what amounts to the same datatype. 
When passing data from one library to the other the data must be converted to 
conform to each library’s internal conventions. Our technique could be used to 
simplify this conversion task; to make this sort of application practical, though, 
iso inference should probably be integrated with type inference, and the class 
of isos inferred should be enlarged. We discuss such possibilities for future work 
below. 

5.1 Related Work 

Besides UUXML, we have already mentioned the HaXML [37] and WASH [35] 
XML data bindings for Haskell. The Model Schema Language semantics [4] is 
now superseded by newer work [32]; we are investigating how to adapt our encod-
ing to the more recent treatment. Special-purpose languages, such as XSLT [36],
XDuce [12], Yatl [6], XMA [22,31], SXSLT [13] and Xtatic [9], take a different 
approach to XML problems. 

In computer science, the use of type isomorphisms seems to have been popu-
larized first by Rittri, who demonstrated their value in software retrieval tasks,
such as searching a software library for functions matching a query type [29].
Since then the area has ballooned; good places to start on the theory of type
isomorphisms are Di Cosmo’s book [8] and the paper by Bruce et al. [5]. More
recent work has focused on linear type isomorphisms [2,33,30,20].

In category theory, Mac Lane initiated the study of coherence in a seminal 
paper [18]; his book [19] treats the case for monoidal categories. Beylin and Dyb- 
jer’s use [3] of Mac Lane’s coherence theorem influenced our technique here. The 
strong regularity condition is sufficient for ensuring that an algebraic theory is 
cartesian; cartesian monads have been used by Leinster [15,14] and Hermida [10]
to formalize the notion of generalized multicategory, which generalizes a usual 
category by imposing an algebraic theory on the objects, and letting the domain 
of an arrow be a term of that theory. 

5.2 Future Work 

Schema Matching. In areas like database management and electronic com-
merce, the plethora of data representation standards - formally, ‘schemas’ - used
to transmit and store data can hinder reuse and data exchange. To deal with
this growing problem, ‘schema matching’, the problem of how to construct a 
mapping between elements of two schemas, has become an active research area. 
Because the size, complexity and number of schemas is only increasing, finding 
ways to accurately and efficiently automate this task has become more and more 
important; see Rahm and Bernstein [28] for a survey of approaches. 

We believe that our approach, which exploits not only the syntax but seman- 
tics of types, could provide new insights into schema matching. In particular, the 
notion of canonical (iso)morphism could help clarify when a mapping’s semantics 
is forced entirely by structural considerations, and when additional information 
(linguistic, descriptive, etc.) is provably required to disambiguate a mapping. 

Implicit Coercions. Thatte introduced a declaration construct for introducing 
user-defined, implicit conversions between types [34], using, like us, an equational 
theory on types. Thatte also presents a principal type inference algorithm for 
his language, which requires that the equational theory is unitary, that is, every
unifiable pair of types has a unique most general unifier. To ensure that theories
are unitary, Thatte demands they be finite and acyclic, and uses a syntactic
condition related to, but different from, strong regularity to ensure finiteness.
In Thatte’s system, coherence seems to hold if and only if the user-supplied
conversions are true inverses.

The relationship between Thatte’s system and ours requires further inves-
tigation. In some ways Thatte’s system is more liberal, allowing for example
distributive theories. On the other hand, the unitariness requirement rules out 
associative theories, which are infinitary. The acyclicity condition also rules out 
commutative theories, which are not strongly regular, but also the currying iso, 
which is. Another difference between Thatte’s system and ours is that his catches 
errors at compile-time, while the implementation we presented here does so at 
run-time. A final difference is that, although the finite acyclicity condition is 
decidable, the requirement that conversions be invertible is not; consequently, 
users may introduce declarations which break the coherence property (produce 
ambiguous programs). In our system, any user-defined conversions are obtained 
structurally, as datatype isos from datatype declarations, which cannot fail to 
be canonical; hence it is not possible to break coherence. 

The Generic Haskell Implementation. We see several ways to improve our 
current implementation of iso inference. 

— We would like to detect inference errors statically rather than dynamically 
(see below). 

— Inferring more isomorphisms (such as the linear currying isos) and more 
powerful kinds of isomorphisms (such as commutativity of products and 
sums, and distributivity of one over the other) is also attractive. 

— Currently, adding new ad hoc coercions requires editing the source code; 
since such coercions typically depend on the domain of application, a better 
approach would be to somehow parametrize the code by them. 
— We could exploit the fact that Generic Haskell allows type cases to be defined on
the -> type constructor: instead of providing two generic functions reduce{|t|}
and expand{|t|}, we would provide only a single generic function:

coerce{|t -> t'|} = expand{|t'|} ∘ reduce{|t|} .

— The fact that the unique witness property does not readily transfer from 
type schemes to types might be circumvented by inferring first-class poly-
morphic functions which can then be instantiated at suitable types. Generic
Haskell does not currently allow this, but if we could write expressions
like coerce{|∀a b. (a, b) -> (b, a)|} we could infer all canonical isos, without
restriction, and perhaps handle examples like Date_NL and Date_US from 
section 1. 



Inference Failure. Because our implementation depends on the “universal” 
type Univ, failure occurs dynamically and a message helpful for pinpointing the 
error’s source is printed. This situation is unsatisfactory, though, since every 
invocation of the expand and reduce functions together mentions the types in- 
volved; in principle, we could detect failures statically, thus increasing program 
reliability. 

Such early detection could also enable new optimizations. For example, if 
the types involved are not only isomorphic but equal, then the conversion is the 
identity and a compiler could omit it altogether. But even if the types are only 
isomorphic, the reduction might not unreasonably be done at compile-time, as 
our isos are all known to be terminating; this just amounts to adjusting the data 
representation ‘at one end’ or the other to match exactly. 

We have investigated, but not tested, an approach for static failure detection 
based on an extension of Generic Haskell’s type-indexed datatypes [11]. The idea 
is to introduce a type-indexed datatype NF{[t]} which denotes the normal form 
of type t w.r.t. the iso theory, and then reformulate our functions so that they
are assigned types: 

reduce{|t|} :: t -> NF{[t]}
expand{|t|} :: NF{[t]} -> t .

For example, considering only products, the type NF{[t]} could be defined as 
follows: 



type NF{[t]}             = Norm{[t]} Unit
data Norm{[Unit]} t      = NUnit t
data Norm{[a :*: b]} t   = NProd (a :*: (b :*: t))
data Norm{[Int]} t       = NBase (Int :*: t) .



This would give the GH compiler enough information to reject bad conversions
at compile-time.

Unfortunately, the semantics of GH’s type-indexed datatypes is too “gener-
ative” for this approach to work. The problem is apparent if we try to compile
the expression: 

expand{|Int|} ∘ reduce{|(Int, ())|} .

GH flags this as a type error, because it treats NF{[Int]} and NF{[(Int, ())]} as
distinct (unequal), though structurally identical, datatypes.

A possible solution to this issue may be a recently considered GH extension 
called type-indexed types (as opposed to type-indexed datatypes). If NF{[t]} is
implemented as a type-indexed type, then, like Haskell’s type synonyms, struc- 
turally identical instances like the ones above will actually be forced to be equal, 
and the expression above should compile. However, type-indexed types - as cur- 
rently envisioned - also share the limitations of Haskell’s type synonyms w.r.t. 
recursion; a type-indexed type like NF{[List Int]} is likely to cause the compiler 
to loop as it tries to expand recursive occurrences while traversing the datatype 
body. Nevertheless, of the several approaches we have considered to address- 
ing the problem of static error detection, type-indexed types seems the most 
promising. 
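
Outside Generic Haskell, the synonym-like behaviour wanted here can be imitated with a closed type family in GHC. The sketch below is not the GH extension discussed above but a substitute technique, and all of its names (Norm as a type family, the class Prod, the methods red and ex) are assumptions. It covers products over Int only. Because type family applications are equal whenever they reduce to the same type, structurally identical normal forms coincide and a mismatched conversion is rejected by the type checker.

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators, UndecidableInstances #-}

data Unit    = Unit    deriving (Eq, Show)
data a :*: b = a :*: b deriving (Eq, Show)
infixr 6 :*:

-- Norm t k: the right-nested normal form of t, followed by the tail k.
type family Norm t k where
  Norm Unit      k = k
  Norm Int       k = Int :*: k
  Norm (a :*: b) k = Norm a (Norm b k)

type NF t = Norm t Unit

class Prod t where
  red :: t -> k -> Norm t k
  ex  :: Norm t k -> (t, k)

instance Prod Unit where
  red Unit k = k
  ex k       = (Unit, k)

instance Prod Int where
  red i k      = i :*: k
  ex (i :*: k) = (i, k)

instance (Prod a, Prod b) => Prod (a :*: b) where
  red (x :*: y) k = red x (red y k)
  ex n = case ex n of
    (u, rest) -> case ex rest of
      (v, rest') -> (u :*: v, rest')

reduce :: Prod t => t -> NF t
reduce x = red x Unit

expand :: Prod t => NF t -> t
expand n = case ex n of
  (v, Unit) -> v

-- Accepted: NF Int and NF (Int :*: Unit) both reduce to Int :*: Unit.
ok :: Int :*: Unit
ok = expand (reduce (5 :: Int))

-- Rejected at compile time if uncommented, since the normal forms differ:
-- bad :: Int :*: Int
-- bad = expand (reduce (5 :: Int))
```

Closed type families have their own restrictions, so this only indicates that the typing discipline is expressible, not that it scales to recursive datatypes.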


Abstract. In the context of a formal programming methodology and verifica-
tion system for ownership-based invariants in object-oriented programs, a friend-
ship system is defined. Friendship is a flexible protocol that allows invariants
expressed over shared state. Such invariants are more expressive than those al-
lowed in existing ownership type systems because they link objects that are not
in the same ownership domain. Friendship permits the modular verification of
cooperating classes. This paper defines friendship, sketches a soundness proof,
and provides several realistic examples.



1 Introduction 

Whether they are implicit or explicit, object invariants are an important part of object- 
oriented programming. An object’s invariant is, in general, a healthiness guarantee that 
the object is in a “good” state, i.e., a valid state for calling methods on it. 

For example, in a base class library for collection types, certain method calls may 
be made on an enumerator only if the underlying collection has not been modified since 
the enumerator was created. Other examples are that a graph remains acyclic or that an 
array stays sorted. 

Various proposals have been made on how object invariants can be formally ex- 
pressed and on different mechanisms for either guaranteeing that such invariants hold 
[LN02, LG86, Mül02] or at least dynamically recognizing moments in execution where
they fail to hold [BS03, CL02, Mey97]. For the most part, these systems require some 
kind of partitioning of heap objects so that an object’s invariant depends only on those 
objects over which it has direct control. This is intuitive, since it is risky for one data 
structure to depend on another over which it has no control. However, systems such 
as ownership types [CNP01, Cla01, BLS03, Mül02] are inflexible in that they demand
object graphs to be hierarchically partitionable so that the dependencies induced by ob- 
ject invariants do not cross ownership boundaries. There are many situations where an 
object depends on another object but cannot reasonably own it. 

We relax these restrictions with a new methodology; we define a protocol by which 
a granting class can give privileges to another friend class that allows the invariant
in the friend class to depend on fields in the granting class. As in real life, friendship
demands the active cooperation of both parties. A friend class can publish restrictions
on field updates of the granting class. The granting class must be willing to operate
within these restrictions. In return, each instance of the friend class must register itself
with the instance of the granting class that it is dependent on. And as in real life, the
quality of the friendship depends on how onerous its burdens are. We believe our system
imposes a minimal set of constraints on the participating parties.

ITR-0326540, the Office of Naval Research under grant N00014-01-1-0837, and Microsoft

Our method builds on ownership-based invariants [BDF+03a], formalized using an
auxiliary field owner [LM04]. We refer to the combination of [BDF+03a] and [LM04]
as the Boogie methodology. An on-going project at Microsoft Research named “Boo- 
gie” is building a tool based on that methodology. To make this paper self-contained, we 
review the relevant features of the object invariant system from that work in Section 2. 

Section 3 presents a representative example of an instance of a granting class per- 
forming a field update that could violate the invariant of an instance of a friend class. 
We describe the required proof obligations for the granting object to perform the field 
update without violating the invariants of its friends or the object invariant system. In 
Section 3.1, we describe how a granting class declares which classes are its friends and 
how granting objects track friends that are dependent upon it. Section 3.2 describes how 
a granting object sees an abstraction of the invariants of its friends, rather than the full 
details. In Section 3.3, we define the obligations incumbent upon the friend class for no- 
tifying granting objects of the dependence. Section 3.4 summarizes all of the features 
of our method. 

Section 4 provides a sketch of a soundness argument. Section 5 describes two ex- 
tensions. The first, in Section 5.1, presents a convenient methodology that shows how 
reasoning about dependents can be linked to the code of the granting class. In Sec- 
tion 5.2, we describe a syntactic means for transmitting information after a field update 
back to the granting object from a friend object. We give several challenging examples 
in Section 6. Section 7 reviews related work and Section 8 summarizes our contribution 
and points out future work. 

We assume some familiarity with the principles of object-oriented programming 
and the basics of assertions (pre- and post-conditions, modifies clause, and invariants) 
as well as their use in the static modular verification of sequential object-oriented pro- 
grams. However, we do not presuppose any particular verification technology. 

For simplicity, we omit any description of subtyping. The full treatment is de- 
scribed in a forthcoming technical report; it follows the same pattern as the Boogie 
work [BDF+03a]. A companion paper [NB04] gives a rigorous proof of soundness in
a semantic model. The concrete syntax that we use is not definitive and illustrates one 
particular encoding. 

2 Using Auxiliary Fields for Ownership-Based Invariants: A Review

Using the contrived example code in Figure 1 we review the Boogie approach to invari- 
ants and ownership. In our illustrative object-oriented language, class types are implic- 
itly reference types; we use the term “object” to mean object reference. 

class Set {
  fst : Node := null;
  insert(x : int)
  {
    t : Node := new Node(x);
    “code to insert t”
  }
  remove(x : int)
  { “delete first node with val x” }
  map(g : Fun)
  { “apply g to all elements; remove duplicates” }
}

class Node {
  val : int
  next : Node
  Node(x : int)
  { val := x; next := null }
}

class Fun {
  apply(x : int) : int
  { return x mod 7; }
}

Fig. 1. A set of integers is represented by a linked list, without duplicate values, rooted at fst.
Method insert adds an element if not already present. Method map(g) updates the set to be its
image through g.apply. Class Node has only a constructor; nodes are manipulated in Set.



An instance of class Set maintains an integer set represented by a sequence without 
duplicates, so that remove(x) can be implemented using a linear search that terminates 
as soon as x is found. The specification of class Set could include invariant 

Inv_Set : fst is the root of an acyclic sequence without duplicate values.

We denote the invariant for a class T by Inv_T. Note that since the invariant mentions
instance fields, it is parameterized by an instance of type T. We write Inv_T(o) where
o is an object of type T when we want to make explicit the value of an invariant for a
particular instance. 

An object invariant is typically conceived as a pre- and post-condition for every 
method of the class. For example, if remove(x) is invoked in a state where there are 
duplicates, it may fail to establish the intended postcondition that x is not in the set. 
Constructors establish invariants. 

The method map takes the function supplied as an argument and, abstractly, maps 
the function over the set to yield an updated set. Suppose it is implemented by first up- 
dating all of the values in place and only after that removing duplicates to re-establish 
the invariant. One difficulty in maintaining object invariants is the possibility of reen-
trant calls: If an object g has access to the instance s on which s.map(g) is invoked,
then within the resulting call to g.apply there could be an invocation of s.remove. But
s at that point is in an inconsistent state - i.e., a state in which Inv_Set(s) does not hold.
It is true that by considering the body of apply as given in Figure 1 we can rule out this 
possibility. But for modular reasoning about Set we would only have a specification 
for apply - quite likely an incomplete one. (Also, Fun could have been written as an 
interface; or apply can be overridden in a subclass of Fun .) 

A sound way to prevent the problem of re-entrance is for the invariant to be an 
explicit precondition and postcondition for every method: apply would be required to 
establish Inv_Set(s) before invoking s.remove, and it cannot do so in our scenario. But
this solution violates the principle of information hiding: Using Node and maintaining
Inv_Set are both decisions that might be changed in a revision of Set (or in a subclass).
Indeed, we might want the field fst to be private to Set whereas the precondition of a 
public method should mention only visible fields. 

It is possible to maintain proper encapsulation by making it the responsibility of Set 
to ensure that its invariant holds at every “observable state”, not only at the beginning
and end of every method but also before any “out” call is made from within a method.
In the example, Set would have to establish Inv_Set(s) within map before each call to
apply. Though frequently proposed, this solution is overly restrictive. For instance, it
would disallow the sketched implementation of map in which removal of duplicates is 
performed only after all the calls to apply . In a well structured program with hierarchi- 
cal abstractions there are many calls “out” from an encapsulation unit, most of which 
do not lead to reentrant callbacks. 

Programmers often guard against reentrant calls using a “call in progress” field; 
this field can be explicitly mentioned in method specifications. In some respects this is 
similar to a lock bit for mutual exclusion in a concurrent setting. Disallowing a call to 
remove while a call to map is in progress can be seen as a protocol and it can be useful 
to specify allowed sequences of method invocations [DF01, DF03]. 

We wish to allow reentrant calls. They are useful, for example, in the ubiquitous 
Subject-View pattern where a reentrant callback is used by a View to inspect the state
of its Subject. On the other hand, general machinery for call protocols seems onerous 
for dealing with object invariants in sequential programs. Moreover this is complicated 
by subclassing: a method added in a subclass has no superclass specification to be held 
to. 

Boogie associates a boolean field inv with the object invariant. This association is 
realized in the following program invariant, a condition that holds in every state. (That 
is, at every control point in the program text.) 

$$(\forall o \bullet o.inv \Rightarrow Inv_T(o) \ \text{where}\ T = type(o)) \tag{1}$$

Here and throughout the paper, quantification ranges over objects allocated in the cur- 
rent state. The dynamic (allocated) class of o is written type(o) . Also, logical con- 
nectives (such as conjunction) should not be assumed to be commutative since we often 
write expressions such as $o \neq null \land o.f = \ldots$ where the right-hand conjunct has a 
meaning only if the left-hand side is true. 

As part of the methodology to ensure that (1) is in fact a program invariant, we 
stipulate that the auxiliary field inv may only be used in specifications and in special 
statements pack and unpack. If the methods of Set all require inv as precondition, 
then apply is prevented from invoking s.remove as in the first solution above, but 
without exposing $Inv_{Set}$ in a public precondition. Nevertheless, the body of remove 
can be verified under precondition $Inv_{Set}$ owing to precondition inv and program 
invariant (1). 

The special statements pack and unpack enforce a discipline to ensure that 
(1) holds in every state. Packing an object sets inv to true; it requires that the ob- 
ject’s invariant holds. Unpacking an object sets inv to false. Since an update to some 
field o.f could falsify the invariant of o, we require that each update be preceded by 
assert $\neg o.inv$. 

The details are deferred so we can turn attention to another issue raised by the 
example, namely representation exposure. The nodes reached from Set .fst are intended 
to comprise an encapsulated data structure, but even if fst is declared private there is 
a risk that node references are leaked, e.g., some client of a set s could change the 
value in a node and thereby falsify the invariant. Representation exposure due to shared 
objects has received considerable attention [LN02, BN02a], including ownership type 
systems [Mül02, CD02, BLS03] and Separation Logic [OYR04]. In large part these 
works are motivated by a notion of ownership: the Nodes reached from s.fst, on which 
$Inv_{Set}(s)$ depends, are owned by that instance s and should not be accessed except 
by s . This ensures that the invariant of s is maintained so long as methods of Set 
maintain it. 

With the exception of Separation Logic, which does not yet deal with object-oriented 
programs, the cited works suffer from inflexibility due to the conservatism necessary for 
static enforcement of alias confinement. For example, type systems have difficulty with 
transferring ownership. However, transfer is necessary in many real-world examples 
and state encapsulation does not necessarily entail a fixed ownership relation. (This is 
emphasized in [OYR04, BN03].) 

A more flexible representation of ownership can be achieved using auxiliary fields 
owner and comm in the way proposed by Barnett et al. [BDF+03a] and refined by 
Leino and Müller [LM04]. The field owner, of type Object, designates the owner, 
or null if there is no owner. The boolean field comm designates whether the object 
is currently committed to its owner: if it is true, then its invariant holds and its owner 
is depending on having sole access for modifying it. The latter is true whenever the 
owner, o, sets its own inv bit, o.inv . Since o ’s invariant may depend on the objects 
that it owns, it cannot guarantee its invariant unless no other object can update any 
object p where p.owner = o, or where p is a transitively owned object. There are 
two associated program invariants. The first is that o.inv implies that every object p 
owned by o is committed. 

$$(\forall o \bullet o.inv \Rightarrow (\forall p \bullet p.owner = o \Rightarrow p.comm)) \tag{2}$$

The second ties commitment to invariants: 

$$(\forall o \bullet o.comm \Rightarrow o.inv) \tag{3}$$

The special fields inv , comm , owner are allowed in pre- and post-conditions; only 
owner is allowed to occur in object invariants. A consequence is that in a state where 
o transitively owns p, we have $o.inv \Rightarrow p.comm$. 

The point of ownership is to constrain the dependence of invariants and to encapsulate 
the objects on which an invariant $Inv_T$ depends so that it cannot be falsified except 
by methods of T. 

Definition 1 (admissible object invariant). An admissible object invariant $Inv_T(o)$ 
is one such that in any state, if $Inv_T(o)$ depends on some object field p.f in the sense 
that update of p.f can falsify $Inv_T(o)$, then either 

- p = o (this means that this.f is in the formula for $Inv_T$); or 

- p is transitively owned by o. 

Moreover, $Inv_T$ is not falsifiable by creation of new objects. 

Transitive ownership is inductively defined to mean that either p.owner = o or that 
p.owner is transitively owned by o. 
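
The inductive definition can be read as a walk up the owner chain. The following is a minimal Python sketch under that reading; the class `Obj`, the function `transitively_owned`, and the example objects are hypothetical names introduced only for illustration and do not appear in the paper.

```python
class Obj:
    """Hypothetical heap object carrying only the auxiliary field `owner`."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner  # another Obj, or None when there is no owner


def transitively_owned(p, o):
    """True if p.owner is o, or p.owner is itself transitively owned by o."""
    seen = set()  # guards against ownership cycles in the chain
    cur = p.owner
    while cur is not None and id(cur) not in seen:
        if cur is o:
            return True
        seen.add(id(cur))
        cur = cur.owner
    return False


# A set owning a node, which in turn owns a second node.
s = Obj("set")
n1 = Obj("node1", owner=s)
n2 = Obj("node2", owner=n1)
stray = Obj("stray")  # allocated but unowned
```

In this sketch `n2` counts as transitively owned by `s` although `s` is not its direct owner, which is the case condition (2) of Definition 1 relies on for linked structures.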

Creation of a new object can falsify a predicate by extending the range of a quantification. 
For example, the predicate $(\forall p \bullet p = o)$ asserts that o is the only object 
and is falsifiable by creation of new objects. It would be difficult to maintain (1) for 
this predicate without impractical restrictions on new. A quantification ranging over 
owned objects, i.e., of the form 

$$(\forall p \mid p.owner = o \bullet \ldots)$$

is not falsifiable by creation of new objects, because the owner field of a new object is 
null.¹ 

Remarkably, f in the Definition is allowed to be public, though for information 
hiding it is often best for it to be private or protected. The ownership discipline makes it 
impossible for an object to update a public field of another object in a way that violates 
invariants. But no restriction is imposed on reading. 

Aside 1 The methodology handles situations where an object owns others that it does 
not directly reference, e.g., nodes in a linked list. But a common situation is direct 
reference like field fst. To cater for this, it is possible to introduce a syntactic marker 
rep on a field, to designate that its value is owned. It is not difficult to devise annotation 
rules to maintain the associated program invariant 

$$(\forall o \bullet o.inv \land o.f \neq null \Rightarrow o.f.owner = o)$$

for every rep field f. On the other hand, one can simply include 
“$this.f = null \lor this.f.owner = o$” as a conjunct of the invariant, so in this paper 
we omit this feature. A similar feature is to mark a field f as peer, to maintain the 
invariant $this.f = null \lor this.f.owner = this.owner$ [LM04]. Again, it is useful but 
does not solve the problems addressed in this paper and is subsumed under our proposal. 

The program invariants hold in every state (loosely put, “at every semicolon”) 
provided that field updates to the field f, with expressions E and D, are annotated as 

assert $\neg E.inv$;   (4) 
E.f := D; 

and the special fields inv, comm, and owner are updated only by the special statements 
defined below. Most important are the special statements for inv and comm.² 

¹ Leino and Müller [LM04] intended, but omitted to say in their definition of admissibility, 
that quantifications over objects must have this form. We prefer a semantic formulation, for 
calculational flexibility and because it highlights what is needed in the soundness proof for the 
case of new. 

² Note that the “foreach” statement in unpack updates the auxiliary field comm of an 
unbounded number of objects. An equivalent expression, more in the flavor of a specification 
statement in which the field comm is viewed as an array indexed by objects, is this: 
change comm such that $(\forall p \bullet p.comm = (old(p.comm) \land p.owner \neq E))$. 

The statement unpack E makes object E susceptible to update; pack E does the 
reverse, re-asserting the invariant for E. 

unpack E = assert $E \neq null \land E.inv \land \neg E.comm$; 
E.inv := false; 
foreach p such that p.owner = E do p.comm := false; 

pack E = assert $E \neq null \land \neg E.inv \land Inv_T(E) \land (\forall p \bullet p.owner = E \Rightarrow \neg p.comm \land p.inv)$; 
foreach p such that p.owner = E do p.comm := true; 
E.inv := true; 

Proofs that pack and unpack maintain the program invariants (1), (2), and (3) can be 
found in [BDF+03a] and [NB04]. Let us consider how (4) maintains (1). An admissible 
invariant for an object o depends only on objects owned by o and thus can only be 
falsified by update of the field of such an object. But an update of p.f is only allowed 
if $\neg p.inv$. If p is owned by o then $\neg p.inv$ can only be achieved by unpacking p, 
which can only be done if p is not committed. But to un-commit p requires unpacking 
o, and then, since $\neg o.inv$, there is no requirement for $Inv_T(o)$ to hold. 

The special statements pack and unpack effectively impose a hierarchical discipline 
of ownership, consistent with the dependence of invariants on transitively 
owned objects. Because the discipline is imposed in terms of auxiliary state and 
verification conditions rather than as an invariant enforced by a static typing system 
[Mül02, Cla01, BLS03, BN02a], the temporary violations permitted by pack and 
unpack offer great flexibility. 
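
To make the discipline concrete, here is a minimal Python sketch that executes the pack and unpack definitions as run-time checks rather than as verification conditions. Everything named here (`Obj`, `check`, `owned_by`, `update`, and the `holder`/`node` example) is an assumption made for illustration; the paper treats inv, comm, and owner as auxiliary state that exists only for reasoning.

```python
class Obj:
    """Hypothetical heap object with the auxiliary fields inv, comm, owner."""

    heap = []  # every allocated object, so that "foreach p" can range over it

    def __init__(self, invariant=lambda o: True, **fields):
        # Implicit constructor initialization of the auxiliary fields.
        self.inv, self.comm, self.owner = False, False, None
        self.invariant = invariant
        self.__dict__.update(fields)
        Obj.heap.append(self)


def check(cond, msg):
    """Run-time stand-in for `assert` in the annotated program."""
    if not cond:
        raise AssertionError(msg)


def owned_by(e):
    return [p for p in Obj.heap if p.owner is e]


def unpack(e):
    check(e is not None and e.inv and not e.comm, "unpack precondition")
    e.inv = False
    for p in owned_by(e):
        p.comm = False


def pack(e):
    check(e is not None and not e.inv and e.invariant(e), "pack precondition")
    check(all(p.inv and not p.comm for p in owned_by(e)),
          "owned objects must be packed and uncommitted")
    for p in owned_by(e):
        p.comm = True
    e.inv = True


def update(e, field, value):
    check(not e.inv, "rule (4): field update requires an unpacked object")
    setattr(e, field, value)


# A holder whose invariant depends on a node that it owns.
node = Obj(value=1)
holder = Obj(invariant=lambda o: o.item.value > 0, item=node)
node.owner = holder  # direct assignment; both objects are still unpacked
pack(node)
pack(holder)         # commits node to holder
```

In this state a call `unpack(node)` fails because `node.comm` is true; the node can be modified only after `unpack(holder)`, at which point the holder's invariant is no longer required to hold. That is the hierarchical order the paragraph above describes.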

Every constructor begins implicitly with initialization 

inv, comm , owner := false, false, null. 

which means that constructors do not need to unpack before assigning to fields. 

The last of the special statements is used to update owner . 

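set-owner E to D = assert $E \neq null \land \neg E.inv \land (D = null \lor \neg D.inv)$; 
E.owner := D; 

At first glance it might appear that the precondition $E.owner = null \lor \neg E.owner.inv$ 
is needed as well, but for non-null E.owner, we get $\neg E.owner.inv$ from $\neg E.inv$ by 
the program invariants: by (3), $\neg E.inv$ gives $\neg E.comm$, and by (2) an object 
that owns an uncommitted object cannot have its inv bit set. 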

A cycle of ownership can be made using set-owner, but the precondition for 
pack cannot be established for an object in such a cycle. 
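
The claim about cycles can be checked on a two-object example. The sketch below is a simplified Python model, not the paper's formal system: object invariants are taken to be trivially true, and the names `Obj`, `check`, `set_owner`, `pack`, and `can_pack` are hypothetical.

```python
class Obj:
    """Hypothetical object with the auxiliary fields only."""

    def __init__(self, name):
        self.name = name
        self.inv, self.comm, self.owner = False, False, None


def check(cond, msg):
    if not cond:
        raise AssertionError(msg)


def set_owner(e, d):
    check(e is not None and not e.inv and (d is None or not d.inv),
          "set-owner precondition")
    e.owner = d


def pack(e, heap):
    # The object invariant itself is assumed to be true in this sketch.
    check(e is not None and not e.inv, "pack precondition")
    owned = [p for p in heap if p.owner is e]
    check(all(p.inv and not p.comm for p in owned),
          e.name + ": owned objects must be packed first")
    for p in owned:
        p.comm = True
    e.inv = True


def can_pack(e, heap):
    try:
        pack(e, heap)
        return True
    except AssertionError:
        return False


a, b = Obj("a"), Obj("b")
heap = [a, b]
set_owner(a, b)  # b owns a
set_owner(b, a)  # a owns b: both preconditions hold, so the cycle is created
```

Packing `a` requires `b.inv`, and packing `b` requires `a.inv`, so neither call can go first and both objects stay unpacked.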

One of the strengths of this approach to ownership is that set-owner can be used 
to transfer ownership as well as to initialize it (see the example in Section 6.3). Another 
strength is the way invariants may be declared at every level of an inheritance chain; 
we have simplified those few parts of the methodology which are concerned with subclassing. 
The reader may refer to the previous papers [BDF+03a, LM04] for more 
discussion. 

3 The Problem: Objects without Borders 



The Boogie methodology is adequate for the maintenance of ownership-based invariants. 
Such invariants are over objects within a single domain, i.e., encapsulated by the 
ownership boundary. Our contribution in this paper, summarized in Section 3.4, is to go 
beyond ownership to cooperating objects, ones whose invariants “cross the border”. 

We use the code in Fig. 2 as our running example. It uses standard specification 
constructs for method pre-conditions and post-conditions and class invariants. The 
invariant $0 \leq time$ in class Master abbreviates $0 \leq this.time$. Thus, by our notational 
convention, $Inv_{Master}(o)$ denotes $0 \leq o.time$. According to the rules for admissible 
invariants in Section 2, $Inv_{Master}$ is allowed. 

class Master { 
  time : int; 
  invariant 0 ≤ time; 

  Master() 
    ensures inv ∧ ¬comm; 
  { time := 0; pack this; } 

  Tick(n : int) 
    requires inv ∧ ¬comm ∧ 0 ≤ n; 
    modifies time; 
    ensures time ≥ old(time); 
  { 
    unpack this; 
    time := time + n; 
    pack this; 
  } 
} 

class Clock { 
  t : int; 
  m : Master; 
  invariant m ≠ null ∧ 0 ≤ t ≤ m.time; 

  Clock(mast : Master) 
    requires mast ≠ null ∧ mast.inv; 
    ensures inv ∧ ¬comm; 
  { m := mast; t := 0; pack this; } 

  Sync() 
    requires inv ∧ ¬comm; 
    modifies t; 
    ensures t = m.time; 
  { unpack this; t := m.time; pack this; } 
} 

Fig. 2. A simple system for clocks synchronized with a master clock. $Inv_{Clock}$(this) depends 
on this.m.time but a clock does not own this.m. 

The constructor for Master exemplifies the usual pattern for constructors: it first 
initializes the fields in order to establish the invariant and then uses pack to set the inv 
bit. Methods that update state typically first execute unpack to turn off the inv bit 
and then are free to modify field values. Before they return, they use pack once their 
invariant has been reestablished. 

The predicate $Inv_{Clock}$ is not an admissible invariant: it depends on m.time, but 
a clock does not own its master. Otherwise a master could not be associated with more 
than one clock. While it might be reasonable to let the master own the clocks that point 
to it, we wish to address situations where this ownership relation would not be suitable. 
More to the point, such a solution would only allow $Inv_{Master}$ to depend on the clocks 
whereas we want $Inv_{Clock}$ to depend on the master. 

Although $Inv_{Clock}$ is not admissible according to Definition 1, the update of time 
in Tick increases the value of time, which cannot falsify $Inv_{Clock}$. The problem that 
our methodology solves is to allow non-ownership dependence in a situation like this, 
i.e., to support modular reasoning about the cooperative relationship wherein Tick does 
not violate $Inv_{Clock}$. 

However, while Tick is a safe method in relation to $Inv_{Clock}$, we want to preclude 
the class Master from defining a method Reset: 

Reset() 
  requires inv; 
  modifies time; 
{ unpack this; time := 0; pack this; } 

This is easily shown correct in terms of $Inv_{Master}$, but o.Reset can falsify the invariant 
of any clock c with c.m = o. If we allow $Inv_{Clock}$ to depend on m.time and yet 
prevent this error, a precondition stronger than that in (4) must be used for field update. 
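
The failure is easy to reproduce in ordinary code. The following Python sketch transcribes Master and Clock without any of the auxiliary fields, reading the invariant of Clock as 0 ≤ t ≤ m.time; the method `invariant` and the lower-case method names are assumptions introduced for the illustration.

```python
class Master:
    def __init__(self):
        self.time = 0

    def tick(self, n):
        if n < 0:
            raise ValueError("Tick requires 0 <= n")
        self.time += n  # time never decreases

    def reset(self):
        self.time = 0   # correct for Master alone, unsafe for dependent clocks


class Clock:
    def __init__(self, mast):
        self.m = mast
        self.t = 0

    def sync(self):
        self.t = self.m.time

    def invariant(self):
        # Inv_Clock: m != null and 0 <= t <= m.time
        return self.m is not None and 0 <= self.t <= self.m.time


master = Master()
clock = Clock(master)
master.tick(5)
clock.sync()                      # clock.t is now 5
holds_after_tick = clock.invariant()
master.reset()                    # master.time drops back to 0
holds_after_reset = clock.invariant()
```

After `tick` and `sync` the clock invariant holds, and after `reset` it does not, even though `reset` never touches a field of `clock` and preserves the master's own invariant.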

In Section 5.1, we show how Reset can be correctly programmed without violating 
the invariant of Clock . For now, we continue to focus on the assignment to time as 
motivation for a methodology that justifies the code in Figure 2. 

Leino and Müller’s discipline [LM04] strengthens (4) to yield the following annotation: 

assert $\neg this.inv \land (\forall p \mid type(p) = Clock \bullet \neg p.inv)$; 
this.time := 0; 

Unfortunately, this does not seem to be a very practical solution. How can modular 
specifications and reasoning about an arbitrary instance of Master hope to establish a 
predicate concerning all clocks whatsoever, even in the unlikely event that the predicate 
is true? Given the ownership system, it is also unlikely that an instance of Master 
would be able to unpack any clock that refers to it via its m field and whose inv 
field was true. 

Consider taking what appears to be a step backwards, concerning the Boogie methodology. 
We could weaken the annotation in the preceding paragraph to allow the master 
to perform the field update to time as long as it does not invalidate the invariants of 
any clocks that could possibly be referring to it. 

assert $\neg this.inv \land (\forall p \mid type(p) = Clock \bullet \neg p.inv \lor (Inv_{Clock}(p))^{this.time}_{0})$; 
this.time := 0; 

The substitution expression $P^x_E$ represents the expression P with all unbound occurrences 
of x replaced by E, with renaming as necessary to prevent name capture. We 
use substitution to express the weakest precondition and assume that aliasing is handled 
correctly.³ But the revised precondition does not appear to provide any benefit: while 
$\neg this.inv$ is established by the preceding unpack in Reset, there is still no clear 

³ Substitution for updates of object fields can be formalized in a number of ways and the technical 
details are not germane to this paper [AO97, FLL+02, PdB03]. In general, object update 
has a global effect, and our aim is to achieve sound localized reasoning about such updates. 

way to establish either of the disjuncts for arbitrary instances p of Clock. In addition, 
as stated, this proposal has the flaw that it exposes $Inv_{Clock}$ outside of class Clock. 
We solve both of these problems. Given the following general scheme: 

assert $\neg E.inv \land (\forall p, T \mid \ldots \bullet \neg p.inv \lor (Inv_T(p))^{E.f}_{D})$;   (5) 
E.f := D; 

where the missing condition somehow expresses that type(p) = T and $Inv_T(p)$ 
depends on E, our methodology provides a way to manage the range of p and a way 
to abstract from $(Inv_T(p))^{E.f}_{D}$. 

In the following three subsections we first deal with restricting the range of p in 
(5). Then we show how to abstract from $(Inv_T(p))^{E.f}_{D}$ in (5) to achieve class-oriented 
information hiding. Finally we complete the story about the range of p and redefine 
admissible invariants. 

3.1 Representing Dependence 

The first problem is to determine which objects p have $Inv_{Clock}(p)$ dependent on a 
given instance of Master. (In general, there could be other classes with invariants that 
depend on instances of Master, further extending the range of p needed for soundness.) 
To allow for intentional cooperation, we introduce an explicit friend declaration 

friend Clock reads time; 

in class Master.⁴ For a friend declaration appearing in class T': 

friend T reads f; 

we say T' is the granting class and T the friend. Field f is visible in code and specifications 
in class T. (Read access is sufficient.) There are some technical restrictions 
on f listed in Section 3.4. When $Inv_T(p)$ depends on o.f for some granting object o, 
then o is reachable from p. For simplicity in this paper, we confine attention to paths 
of length one, so o = p.g for some field g which we call a pivot field. (We also allow 
p.g.f.h in $Inv_T(p)$, where h is an immutable field of f, e.g., the length of an array.) 

One of the benefits of our methodology is to facilitate the decentralized formulation 
of invariants, which lessens the need for paths in invariants. An example is the condition 
linking adjacent nodes in a doubly-linked list: reachability is needed if this is an invariant 
of the list header, but we can maintain the invariant by imposing a local invariant on 
each node that refers only to its successor node; see the example in Section 6.3. 

To further restrict the range of p in (5) to relevant friends, we could explore more 
complicated syntactic conditions, but with predictable limitations due to static analysis. 
We choose instead to use auxiliary state to track which friend instances are susceptible 
to having their invariants falsified by update of fields of a granting object. 

We introduce an auxiliary field deps of type “set of object”. We will arrange that 
for any o in any state, o.deps contains all p such that p.g = o for some pivot field g 

⁴ Similar features are found in languages including C++ and C#, and in the Leino-Müller work. 

by which Inv(p) depends as friend on some field of o. As with owner, this facilitates 
making explicit the relevant program invariants. Both owner and deps function as 
“back pointers” in the opposite direction of a dependence. 

The associated program invariant is roughly this: 

$$(\forall o : T' \bullet (\forall p : T \mid p.inv \land \text{“}Inv_T(p) \text{ depends on } o.f\text{”} \bullet p \in o.deps)) \tag{6}$$

for every T, T' such that T is a friend of T' reading f. It should become clear later, 
when we define admissibility for invariants, that dependence happens via a pivot field. 

It is not necessary for o.deps to contain only the objects p such that $Inv_T(p)$ 
depends on o.f. 

We have reached the penultimate version of the rule for update of a field with friend 
dependents: 

assert $\neg E.inv \land (\forall p \mid p \in E.deps \bullet \neg p.inv \lor (Inv_{type(p)}(p))^{E.f}_{D})$;   (7) 
E.f := D; 

A friend declaration could trigger a requirement that field updates in the granting class 
be guarded as in (7) and one could argue that in return for visibility of f in T, $Inv_T$ 
should simply be visible in T'. This is essentially to say that the two classes are in a 
single module. Our methodology facilitates more hiding of information than that, while 
allowing cooperation and dealing with the problem of the range of p in (7). In the next 
subsection we eliminate the exposure of Inv in this rule, and then in the following 
subsection we deal with reasoning about deps. 

3.2 Abstracting from the Friend’s Invariant 

Our solution is to abstract from $(Inv_T(p))^{E.f}_{D}$ not as an auxiliary field but as a predicate U 
(for update guard). The predicate U is declared in class T, and there it gives rise to a 
proof obligation, roughly this: if both the friend object’s invariant holds and the update 
guard holds, then the assignment statement will not violate the friend object’s invariant. 
This predicate plays a role in the interface specification of class T, describing not an 
operation provided by T but rather the effect on T of operations elsewhere. There is 
a resemblance to behavioral assumptions in Rely-Guarantee reasoning for concurrent 
programs [Jon83, dRdBH+01]. 

In the friend class T it is the pivot field g and the friend field f that are visible, not 
the expressions E and D in an update that occurs in the code of the granting class T'. 
So, in order to define the update guard we introduce a special variable, val, to represent 
the value the field is being assigned: 

guard g.f := val by U(this, g, val); 

This construct appears in the friend class and must be expressed in terms that are visible 
to the granting class (thus allowing the friend class to hide its private information). 
We write U(friend, granter, val) to make the parameters explicit. That is, U is defined 
in the context of T using vocabulary (this, g, val) but instantiated by the triple 
(p, E, D) at the update site in a granter method (see (7) and below). For example, the 
update guard declared in the friend class Clock is: 

guard m.time := val by m.time ≤ val; 

Thus $U_{Clock}(this, m, val) = (m.time \leq val)$. Notice that this does not appear in 
this particular update guard. That is because, as stated earlier, it does not depend on the 
state of the instance of Clock. 

Like a method declaration, an update guard declaration imposes a proof obligation. 
The obligations on the friend class T are: 

$$Inv_T(this) \land U(this, g, val) \Rightarrow (Inv_T(this))^{g.f}_{val} \tag{8}$$

for each pivot g of type T' and friend field f. A suitable default is to take U to 
be false so that the proof obligation is vacuous. Then the update rule is equivalent to 
that in [LM04]. At the other extreme, if, despite the declarations, Inv does not in fact 
depend on the pivot then U can be taken to be true. 
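
As a worked instance, obligation (8) can be discharged for the Clock guard by a short calculation. The derivation below is a sketch that assumes the invariant of Clock is $m \neq null \land 0 \leq t \leq m.time$ and the guard is $m.time \leq val$, with pivot m and friend field time; the layout of the steps is ours, not the paper's.

```latex
\begin{align*}
        & Inv_{Clock}(this) \land U_{Clock}(this, m, val) \\
\equiv\ & m \neq null \land 0 \leq t \leq m.time \land m.time \leq val
          && \text{(definitions)} \\
\Rightarrow\ & m \neq null \land 0 \leq t \leq val
          && \text{(transitivity of $\leq$)} \\
\equiv\ & (Inv_{Clock}(this))^{m.time}_{val}
          && \text{(substitution of $val$ for $m.time$)}
\end{align*}
```

The only step that uses the guard is the transitivity step, which is why a guard that merely forbids decreasing time is sufficient for this invariant.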

We have now reached the final version of the rule for update of a friend field: 

assert $\neg E.inv \land (\forall p \mid p \in E.deps \bullet \neg p.inv \lor U(p, E, D))$;   (9) 
E.f := D; 

A field update may now be performed without violating the 
invariants of an object’s friends by establishing the precondition 

$$(\forall p \mid p \in E.deps \bullet \neg p.inv \lor U(p, E, D))$$

where U was written by the author of the class T in such a way that the class T' 
is able to (at least potentially) satisfy it. That is, it is an expression containing values 
and variables that are accessible in the context of T' and need not involve the private 
implementation details of T. 

In the design of class T , some state variables may be introduced and made visible 
to T' precisely in order to express U , without revealing too much of the internal 
representation of T . We pursue this further in Section 5.2. 

For the clock example, $U_{Clock}(p, this, time + n) = (time \leq time + n)$, which 
follows easily from precondition $0 \leq n$ of method Tick; thus the update precondition 
can be established independently of any reasoning about deps. On the other hand, 
within the method Reset, $U_{Clock}(p, this, 0) = (time \leq 0)$, which does not follow 
from $p \in this.deps$ and the precondition given for Reset. 

Reset should only be allowed if no clocks depend on this master, which would 
follow from $deps = \emptyset$ according to program invariant (6). We show our discipline for 
reasoning about deps in the next subsection. 
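
The contrast between Tick and Reset can be seen by evaluating the precondition of rule (9) at run time. The Python sketch below is illustrative only: `guarded_update` and `clock_guard` are hypothetical names, the guard is read as m.time ≤ val, the clock is simply created with inv set to true, and the clock is added to `deps` directly, standing in for the registration mechanism of the next subsection.

```python
class Master:
    def __init__(self):
        self.time = 0
        self.inv = False   # left unpacked so that updates are permitted
        self.deps = set()  # auxiliary field: friends depending on this master


class Clock:
    def __init__(self, mast):
        self.m, self.t = mast, 0
        self.inv = True    # modelled as already packed


def clock_guard(p, granter, val):
    """U_Clock(p, granter, val): the new value must not decrease the time."""
    return granter.time <= val


def guarded_update(granter, field, value, guard):
    """Rule (9): check the precondition, then perform granter.field := value."""
    if granter.inv:
        raise AssertionError("granter must be unpacked")
    for p in granter.deps:
        if p.inv and not guard(p, granter, value):
            raise AssertionError("update guard fails for a packed dependent")
    setattr(granter, field, value)


master = Master()
clock = Clock(master)
master.deps.add(clock)

# The update performed by Tick(5): the guard 0 <= 5 holds.
guarded_update(master, "time", master.time + 5, clock_guard)
```

A subsequent `guarded_update(master, "time", 0, clock_guard)`, the update performed by Reset, raises `AssertionError` as long as `clock` is packed and present in `master.deps`; with an empty `deps` set the same update goes through.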

3.3 Notification of Dependence 

To maintain program invariant (6) we force each friend object to register itself with the 
granting object in order to include itself in the granting object’s deps field. Definition 2 
of admissibility in Section 3.4 requires that $Inv_T$ satisfies the following, for each pivot 
field g: 

$$Inv_T(this) \Rightarrow g = null \lor this \in g.deps \tag{10}$$

One way to satisfy (10) is to add $g = null \lor this \in g.deps$ as a conjunct of $Inv_T$. 
We allow the field deps to be updated only by the special statements attach and 
detach which add and remove an object from this.deps. 

attach E = assert $E \neq null \land \neg inv$; 
deps := $deps \cup \{E\}$; 

detach E = assert $E \neq null \land \neg E.inv \land \neg inv$; 
deps := $deps - \{E\}$; 

The attach and detach statements are allowed only in the code of the class T' where 
T' declares T to be a friend; their effect is to update this.deps. It is in code of T' that 
we need to reason about deps and thus to use attach . This means that it is incumbent 
upon a friend to call some method in the granter when setting a pivot field to refer to 
the granter. This gives the granter a chance to either record the identity of the dependent 
(see the Subject/View example in Section 6.1) or to change some other data structure 
to reflect the fact that the dependent has registered itself (as in the Clock example, 
completed in Section 3.4). 

Aside 2 One could imagine that attachment is triggered automatically by the assignment 
in a dependent to its pivot field. It is possible to work out such a system but it 
has the flaw that the granter is not given a chance to establish and maintain invariants 
about deps. Also, the conjunct $\neg E.inv$ in the precondition to detach is stronger than 
necessary. The alternative is to require either that E is unpacked or that it no longer 
has its pivot field referring to this, but that would require the granter to know more 
about the pivot fields in its friends than we would like. In [NB04], we formulate the 
detach statement with the weaker pre-condition. 



3.4 Summary 

To summarize the required annotations and program invariants, we begin with our original 
example from Figure 2 and rewrite it as shown in Figure 3. The two invariant 
declarations in Clock are conjoined to be the invariant for the class. In the constructor 
for Clock, t must be initialized to zero and the call to m.Connect must occur in order 
to satisfy the class invariant before calling pack. Note that $Inv_{Clock}$ now satisfies (10) 
owing to the conjunct $this \in m.deps$. This conjunct is established in the constructor 
by the invocation m.Connect(this). In this case, Connect is needed only for reasoning. 
In most friendship situations the granter needs some method for registering friends 
in order to maintain more information about them. For example, only when the Master 
class maintains concrete program state about each object in its deps field is it possible 
to introduce the Reset method (see Section 5.1). All of the examples shown in Section 6 
also show this pattern. 

class Master { 
  time : int; 
  invariant 0 ≤ time; 
  friend Clock reads time; 

  Master() 
    ensures inv ∧ ¬comm; 
  { time := 0; pack this; } 

  Tick(n : int) 
    requires inv ∧ ¬comm ∧ 0 ≤ n; 
    modifies time; 
    ensures time ≥ old(time); 
  { 
    unpack this; 
    time := time + n; 
    pack this; 
  } 

  Connect(c : Clock) 
    requires inv; 
    ensures c ∈ this.deps; 
  { 
    unpack this; 
    attach c; 
    pack this; 
  } 
} 

class Clock { 
  t : int; 
  m : Master; 
  invariant m ≠ null ∧ 0 ≤ t ≤ m.time; 
  invariant this ∈ m.deps; 
  guard m.time := val by m.time ≤ val; 

  Clock(mast : Master) 
    requires mast ≠ null ∧ mast.inv; 
    ensures inv ∧ ¬comm; 
  { 
    m := mast; 
    t := 0; 
    m.Connect(this); 
    pack this; 
  } 

  Sync() 
    requires inv ∧ ¬comm; 
    modifies t; 
    ensures t = m.time; 
  { unpack this; t := m.time; pack this; } 
} 

Fig. 3. Clocks synchronized with a master clock. $Inv_{Clock}$(this) depends on this.m.time but 
a clock does not own this.m. 
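
For readers who want to experiment, the following Python sketch transcribes Fig. 3 with the auxiliary fields inv and deps made concrete and the annotations turned into run-time checks. It is an executable approximation under stated simplifications, not the authors' formal semantics: comm and owner are omitted because no ownership is involved, comparisons are read as ≤ and ≥, and the helper names (`check`, `_pack`, `_unpack`, `_set_time`, `guard_time`) are hypothetical.

```python
def check(cond, msg):
    """Run-time stand-in for the asserted preconditions."""
    if not cond:
        raise AssertionError(msg)


class Master:
    def __init__(self):
        self.inv, self.deps = False, set()  # auxiliary fields
        self.time = 0
        self._pack()

    def invariant(self):
        return 0 <= self.time

    def _pack(self):
        check(not self.inv and self.invariant(), "pack Master")
        self.inv = True

    def _unpack(self):
        check(self.inv, "unpack Master")
        self.inv = False

    def _set_time(self, val):
        # Update rule (9), using the guard declared by the friend class Clock.
        check(not self.inv, "Master must be unpacked")
        for p in self.deps:
            check(not p.inv or Clock.guard_time(p, self, val), "guard violated")
        self.time = val

    def tick(self, n):
        check(self.inv and 0 <= n, "Tick precondition")
        self._unpack()
        self._set_time(self.time + n)
        self._pack()

    def connect(self, c):
        check(self.inv, "Connect precondition")
        self._unpack()
        check(c is not None and not self.inv, "attach precondition")
        self.deps.add(c)  # attach c
        self._pack()


class Clock:
    def __init__(self, mast):
        check(mast is not None and mast.inv, "Clock precondition")
        self.inv = False
        self.m, self.t = mast, 0
        self.m.connect(self)
        self._pack()

    @staticmethod
    def guard_time(friend, granter, val):
        # guard m.time := val by m.time <= val
        return granter.time <= val

    def invariant(self):
        return (self.m is not None and 0 <= self.t <= self.m.time
                and self in self.m.deps)

    def _pack(self):
        check(not self.inv and self.invariant(), "pack Clock")
        self.inv = True

    def _unpack(self):
        check(self.inv, "unpack Clock")
        self.inv = False

    def sync(self):
        check(self.inv, "Sync precondition")
        self._unpack()
        self.t = self.m.time
        self._pack()


master = Master()
c1, c2 = Clock(master), Clock(master)
master.tick(5)
c1.sync()
master.tick(2)
```

Both clocks remain packed with their invariants intact throughout, although the master never inspects their fields; an attempt to assign 0 to `time` through `_set_time` while the clocks are packed is rejected by the guard check.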



To summarize our methodology, we first recall the rule for annotation of field update, 
(9). A separate guard $U_f$ is declared for each field f on which a friend depends, 
so the rule is as follows. 

assert $\neg E.inv \land (\forall p \mid p \in E.deps \bullet \neg p.inv \lor U_f(p, E, D))$; 
E.f := D; 

It is straightforward to adapt this rule to cater for there being more than one friend 
class, or more than one pivot field of the same granter type but we omit the details (see 
[NB04]). Here, for simplicity, we disallow multiple pivots of the same type. 

A friend may declare more than one update guard for a given f. Each update guard 

guard g.f := val by U(this, g, val); 

gives rise to a proof obligation to be discharged in the context of the friend class: 

$$Inv_T(this) \land U(this, g, val) \Rightarrow (Inv_T(this))^{g.f}_{val}$$

For the precondition of a particular update of f in the granter, the reasoner may choose 
any of the update guards given for f. 

The four auxiliary fields inv, comm, owner, deps may all appear in method specifications 
and assertions, but they are updated only by special statements. 

We refrain from repeating the definitions of pack and unpack, which remain 
unchanged from Section 2. The set-owner statement needs to be revised: a friend 
may be granted access to owner, in which case there needs to be an update guard for 
owner just like for ordinary fields: 

set-owner E to D = 
assert $E \neq null \land \neg E.inv \land (D = null \lor \neg D.inv)$; 
assert $(\forall p \mid p \in E.deps \bullet \neg p.inv \lor U_{owner}(p, E, D))$; 
E.owner := D; 

Note that if D is an object, it must be unpacked as its invariant is at risk when E 
becomes owned. 



Definition 2 (admissible invariant). An invariant $Inv_T(o)$ is admissible just if for 
every X.f on which it depends, $f \not\equiv inv$, $f \not\equiv comm$, and either 

- X is o (in the formula that means X is this); 

- X is transitively owned by o and $f \not\equiv deps$; or 

- X is o.g where field g (called a pivot) has some type T' that declares “friend T 
reads f”. 

Moreover, the implication 

$$Inv_T(o) \Rightarrow g = null \lor o \in g.deps \tag{11}$$

must be valid. Finally, $Inv_T(o)$ is not falsifiable by creation of new objects. 

We write ≡ for syntactic equality, not logical equivalence. 

There are easy syntactic checks for the ownership condition, e.g., it holds if X 
has the form $g.h.\ldots.j$ where each is a rep field, or if X is a variable bound by 
$(\forall X \mid X.owner = o \bullet \ldots)$. Requirement (11) is met by including either the 
condition $this \in g.deps$ or the condition $g = null \lor this \in g.deps$ as a conjunct of 
the declared invariant. (A fine point is that an admissible invariant should only depend 
on deps in this way; see [NB04].) Although we do not use it in this paper, it is possible 
to have a pivot tag that marks the fields in the friend class that appear in the friend’s 
invariant. Then there would be an easy syntactic process for imposing the requirement 
and allowing no other dependence on deps. 

We extend the three program invariants (1-3) with a fourth invariant. Taken together, 
they ensure the following, for all o, T, f with type(o) = T. 

$$o.inv \Rightarrow Inv_T(o) \tag{12}$$

$$o.inv \Rightarrow (\forall p \mid p.owner = o \bullet p.comm) \tag{13}$$

$$o.comm \Rightarrow o.inv \tag{14}$$

For every T', g, p such that type(p) = T' and $Inv_{T'}$ depends on pivot g: 

$$p.g = o \land p.inv \Rightarrow p \in o.deps \tag{15}$$

It is the first invariant that is the key to the entire methodology. It abstracts an object’s 
invariant, preserving data encapsulation and allowing flexibility for reentrancy. The other 
invariants are all mechanisms needed in order to maintain (12) in the presence of inter-object 
dependence. The second and third work within ownership domains while our 
contribution adds cooperating objects. 
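
One way to read (12)-(15) is as a predicate over a snapshot of the heap. The Python sketch below checks the four conditions on a list of objects; it is a testing aid under simplifying assumptions, not part of the methodology. In particular, the attribute `pivots` (the granters that an object's invariant reads through a pivot field) and the function `violated` are hypothetical.

```python
class Obj:
    """Hypothetical heap object carrying the four auxiliary fields."""

    def __init__(self, invariant=lambda o: True):
        self.invariant = invariant
        self.inv, self.comm, self.owner = False, False, None
        self.deps = set()
        self.pivots = []  # granters read by this object's invariant


def violated(heap):
    """Return the numbers of the program invariants (12)-(15) that fail."""
    bad = set()
    for o in heap:
        if o.inv and not o.invariant(o):
            bad.add(12)
        if o.inv and any(not p.comm for p in heap if p.owner is o):
            bad.add(13)
        if o.comm and not o.inv:
            bad.add(14)
        if o.inv and any(g is not None and o not in g.deps for g in o.pivots):
            bad.add(15)
    return sorted(bad)


# A packed master and a packed clock that depends on it as a friend.
master = Obj()
master.time = 3
master.inv = True

clock = Obj(lambda c: 0 <= c.t <= c.pivots[0].time)
clock.t = 2
clock.pivots = [master]
clock.inv = True
master.deps.add(clock)

heap = [master, clock]
```

On this snapshot `violated(heap)` is empty. Removing `clock` from `master.deps` makes it report 15, and lowering `master.time` below `clock.t` makes it report 12, which is the violation that the guarded update rule is designed to prevent.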

4 Soundness 

Consider any program annotated with invariants, friend declarations, and update guards 
satisfying the stated restrictions. We confine attention to the core methodology summa- 
rized in Section 3.4. Suppose that the obligations are met: the invariants are admissible 
and the update guard obligations are satisfied. Suppose also that every field update is 
preceded by the stipulated assertion, or one that implies it. We claim that (12-15) are 
program invariants, that is, true in every state. We refrain from formalizing precisely 
what that means, to avoid commitment to a particular verification system or logic. 

A detailed formal proof of soundness for the full methodology is given in a companion
paper [NB04]. An informal argument has been given for the features already
present in the previous Boogie papers [BDF+03a, LM04]; we have augmented the
preconditions used in those papers. We consider highlights for the new features.

Consider first the new invariant (15), and the statements which could falsify it. 

- pack sets $p.inv$, but under the precondition $Inv(p)$, and by admissibility this
implies $p \in o.deps$ for any $o$ on which $p$ has a friend dependence.

- new initializes $deps = \emptyset$ but also $inv = false$. By freshness, no existing object
has an owner or friend dependency on the new object.

- A field update $E.f := D$ can falsify it only if $f$ is a pivot of $E$, but this is done
under precondition $\neg E.inv$.

- detach removes an element from this.deps but under precondition $\neg this.inv$.

Invariants (13) and (14) do not merit much attention as they do not involve the new 
fields and the new commands attach and detach do not involve inv or comm . 

For (12), we must reconsider field update, $E.f := E'$, because $Inv_T(o)$ can have
friend dependencies. By invariant (15), if $o$ is a friend dependent on $E$, either $\neg o.inv$
or $o \in E.deps$. In the latter case, the precondition for update requires $U(o, E, E')$.
The proof obligation for this update guard yields that $Inv_T(o)$ is not falsified by the
update.

Both attach and detach have the potential to falsify (12) insofar as object invariants
are allowed to depend on deps fields. A local dependence on this.deps is
no problem, owing to precondition $\neg this.inv$. An admissible invariant is not allowed
to depend on the deps field of an owned object. What about friends? An admissible
invariant is required to depend on g.deps for each pivot g, but in a specific way that
cannot be falsified by attach and that cannot be falsified by detach under its
precondition. Finally, the detach E statement has the potential to falsify the consequent
in (15), and this too is prevented by its precondition that either $\neg E.inv$ or E has no
pivots referring to this. The intricacy of this interdependence is one motivation for
carrying out a rigorous semantic proof of soundness.
5 Extensions

In this section, we present two extensions. The first is a methodology for creating an 
invariant that eases the burden of reasoning about the deps field in the granting class. 
The second is a syntactic extension to the update guard that provides extra information 
to the granting class after it performs a field update. 

5.1 Tracking Dependencies in Invariants 

We look again at the Reset method in Clock . In order to set time to zero, an instance 
of Master must know either that each of the clocks referring to it have their value of 
t also as zero or that there are no clocks referring to it. By program invariant (15), the 
latter case is true when deps is empty. For this example, it suffices for the master clock 
to maintain a reference count, clocks , of the clocks that are referring to it via their 
field m , incrementing it each time attach is executed and decrementing it upon each 
detach. That is, variable clocks maintains the invariant clocks = size(deps) . Given 
that invariant, the precondition for the update to time in Reset can be that clocks is 
equal to zero. 
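
The reference-count idea can be tried out dynamically. The sketch below is an
illustration only: the Python class `Master`, its helper methods `_pack`, `_unpack`
and `_holds`, and the use of `assert` for preconditions are assumptions made here,
and the extra precondition that the methodology places on detach is deliberately
left out.

```python
class Master:
    """Toy model of a granting class that keeps clocks == len(deps)."""

    def __init__(self):
        self.time = 0
        self.clocks = 0
        self.deps = set()      # stands in for the special deps field
        self.inv = False
        self._pack()

    def _holds(self):
        # declared invariant conjoined with the dependency-count invariant
        return 0 <= self.time and self.clocks == len(self.deps)

    def _unpack(self):
        assert self.inv, "unpack requires inv"
        self.inv = False

    def _pack(self):
        assert not self.inv and self._holds(), "pack requires the invariant"
        self.inv = True

    def connect(self, c):
        assert self.inv and c not in self.deps
        self._unpack(); self.clocks += 1; self.deps.add(c); self._pack()

    def disconnect(self, c):
        assert self.inv and c in self.deps
        self._unpack(); self.clocks -= 1; self.deps.remove(c); self._pack()

    def tick(self, n):
        assert self.inv and n > 0
        self._unpack(); self.time += n; self._pack()

    def reset(self):
        # with clocks == 0 and clocks == len(deps), no clock depends on time
        assert self.inv and self.clocks == 0, "reset requires no dependent clocks"
        self._unpack(); self.time = 0; self._pack()
```

Calling `reset` while some clock is connected fails its precondition; after the clock
is disconnected the same call succeeds.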

In general, we refer to the invariant that the granting class maintains about its deps
variable as Dep. The invariant must be strong enough to derive enough information
about all objects $p \in deps$ to establish the precondition in (9). Thus we formulate Dep
as a predicate on an element of deps and introduce the following invariant as a proof
obligation in the granting class.

$$(\forall p \mid p \in deps \bullet Dep(this, p)) \tag{16}$$

As with $U$, we make $this$ an explicit parameter in the declaration.

We extend the friend syntax in the granting class to define Dep : 

friend x : T reads f keeping Dep(this, x)

It binds x in predicate Dep, which may also depend on state visible in the granting
class. The default is $Dep(this, x) \equiv true$, easing the obligation but providing no help
in reasoning about deps. Like any invariant, Dep cannot depend on inv or comm.
In terms of the verification of the granting class, the effect is to conjoin (16) to any 
declared invariant. 

Figure 4 shows a version of Master with Reset . Note that in the constructor, the 
value of clocks must be set to zero in order to establish the “keeping” predicate, since 
initially deps is empty. The preconditions for Connect and Disconnect restrict the 
value of deps in order to keep an accurate count of the number of clocks referring to 
the master clock. Class Clock need not be revised from Figure 3. 

In this example, Dep is independent of the individual identities of the friend
objects. The Subject/View example (Section 6.1) shows a more typical use of Dep.

5.2 Getting Results from Friendship 

In contrast to the fixed pack/unpack/inv protocol which abstracts Inv(p) to a boolean
field, we have formulated the friend-invariant rule in terms of a shared state predicate. 
class Master {
  time : int;
  clocks : int;
  invariant 0 ≤ time;
  friend c : Clock reads time keeping clocks = size(deps);

  Master()
    ensures inv ∧ ¬comm;
  { time := 0; clocks := 0; pack this; }

  Tick(n : int)
    requires inv ∧ ¬comm ∧ 0 < n;
    modifies time;
    ensures time > old(time);
  { unpack this; time := time + n; pack this; }

  Reset()
    requires inv ∧ clocks = 0;
    modifies time;
  { unpack this; time := 0; pack this; }

  Connect(c : Clock)
    requires inv ∧ c ∉ deps;
    modifies clocks;
    ensures c ∈ deps;
  { unpack this; clocks := clocks + 1; attach c; pack this; }

  Disconnect(c : Clock)
    requires inv ∧ c ∈ deps;
    modifies clocks;
    ensures c ∉ deps;
  { unpack this; clocks := clocks - 1; detach c; pack this; }
}



Fig. 4. Master clock with reset. 



The associated methodology is to introduce public (or module-scoped) state variables 
with which to express U. Minimizing the state space on which U depends could
facilitate fast protocol checking as in Fugue [DF01, DF03].

Whereas invariants are invariant, states get changed. The proposal so far is that the 
public interface of the dependent class T should reveal information about changes 
relevant to T . Given that T publishes the condition under which shared state may be 
changed, why not also publish the effect of such changes? 

We extend the update guard declaration to include predicate Y for the result state: 

guard g.f := val by U(this, g, val) yielding Y(this, g, val);

The proof obligation on the friend class becomes

$$Inv_T(this) \land U(this, g, val) \Rightarrow (Inv_T(this) \land Y(this, g, val))^{g.f}_{val}$$

Note the resemblance to a pre/post specification in which the invariant is explicit. 
At a field update site in the granting class, the yielding predicate can be depended
on after the update:

assert ¬E.inv;

assert (∀p | p ∈ E.deps • ¬p.inv ∨ U(p, E, D))

E.f := D

assume (∀p | p ∈ E.deps • ¬p.inv ∨ Y(p, E, D))

The predicates U and Y are likely to be useful in specifications of methods of T . 
Together with method specifications, the guard/yielding statements of a class give 
the protocol by which it may be used. 
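
The update-site pattern can be mimicked at run time. This is a rough sketch and not
the verification condition itself: the function `guarded_update` and its parameters
`guard` and `yielding` are hypothetical names, and where the methodology assumes Y
after the update, the sketch checks it dynamically instead.

```python
from types import SimpleNamespace


def guarded_update(granter, field_name, new_value, guard, yielding):
    """Perform granter.field_name := new_value under a friend's guard.

    guard(p, granter, new_value) plays the role of U and
    yielding(p, granter, new_value) the role of Y; both are supplied by the caller.
    """
    assert not granter.inv, "the granting object must be unpacked"
    packed = [p for p in granter.deps if p.inv]      # unpacked friends are exempt
    assert all(guard(p, granter, new_value) for p in packed), "update guard fails"
    setattr(granter, field_name, new_value)
    # The methodology assumes Y here; this sketch checks it instead.
    assert all(yielding(p, granter, new_value) for p in packed), "yielding fails"


# Example data shaped like the Subject/View guard on version (field names assumed).
view = SimpleNamespace(inv=True, vsn=3)
subject = SimpleNamespace(inv=False, version=3, deps=[view])
u_version = lambda p, g, val: p.vsn == g.version and val == p.vsn + 1
out = lambda p, g, val: p.vsn + 1 == g.version
guarded_update(subject, "version", 4, u_version, out)
```

A second update of `version` without first re-synchronizing the view would fail the
guard, since the view's `vsn` no longer equals the subject's `version`.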



6 Examples 

In this section, we present several examples with some, but not all, of the details of their
verification. The Subject/View example (Section 6.1) demonstrates the enforcement of a
behavioral protocol. In Section 6.2, the cooperation involves the use of a shared data
structure. Finally, Section 6.3 illustrates how the peer concept [LM04] mentioned in Aside 1
can be easily encoded as a friendship relation.

6.1 Subject/View 

In Figure 5, the class Subject represents an object that maintains a collection of objects 
of type View that depend on it. We refer to the object of type Subject as the subject 
and each object that it holds a reference to in its collection vs as a view. In particular, 



class Subject {
  val : int;
  version : int;
  rep vs : Collection(View);
  friend v : View reads version, val keeping v ∈ this.vs

  Update(n : int)
    requires inv ∧ ¬comm ∧ (∀v ∈ vs • v.inv ∧ ¬v.comm ∧ Sync(v, this));
    modifies val, version;
    ensures val = n ∧ version = old(version) + 1 ∧ (∀v ∈ vs • Sync(v, this));
  {
    unpack this;
    version := version + 1;
    val := n;
    pack this;
    foreach v ∈ vs do v.Notify();
  }
}



Fig. 5. The class Subject . 
each view depends on the fact that whenever the state of the subject, represented by
the val field (which could be a much more elaborate data structure), is changed in
the method Update, then it will receive a call to its Notify method. As part of its
Notify method, a view will make callbacks to its subject to retrieve whatever parts of
the updated state it is interested in. We do not show these state-querying methods (also
known as observers).

To express the synchronization, the subject maintains a field version which indicates
the number of times that Update has been called. A view also keeps track of a
version number, vsn ; a view is up to date if its version matches that of its subject. 

In this example, the method Update requires that the views be uncommitted so that 
they can be re-synchronized using their Notify method. This is much easier to establish 
than the requirement that they be unpacked. For example, it is sufficient for the views 
to be peers of the subject, i.e., that they have the same owner. 

Note that the subject packs itself before calling Notify for all of its views. The 
views are then free to make state-observing calls on the subject, all of which presumably 
have a precondition that inv holds for the subject. Yet it is very important to realize 
that Update is safe from re-entrant calls while it is in the middle of notifying all of the 
views, because a view would not be able to establish the pre-condition that all of the 
views are in sync with the subject. It is only after the method Update has terminated 
that a view can be sure all of the views have been notified, and if it makes a re-entrant 
call, then that would come before Update terminates. 

The exception to this is if a view somehow knew that it was the only view for the 
subject. But in that case, a re-entrant call to Update does not cause any problems with 
the synchronization property. It still can lead to non-termination, but that is outside of 
the scope of our specification. 

In Figure 6, the class View publishes an update guard and update result for updates 
by the subject to its version field, and an update guard without an update result for 
modifications to the subject’s val field. The guards given are not the weakest possible, 
but rather are chosen to avoid exposing internal state. We define Sync and Out as: 

$$Sync(x : View, y : Subject) \equiv x.vsn = y.version$$
$$Out(x : View, y : Subject) \equiv x.vsn + 1 = y.version$$

Even though the class Subject uses View’s field vsn in the precondition and
postcondition of Update, View does not have to declare it as a friend class. However, the
field must be accessible in the scope of the class Subject, e.g., by being public. To keep
control of it, the class View could define a read-only property [Gun00] and make the
field itself private. We leave such details out of our examples. The invariant for the class
is the conjunction of the two separately declared invariants.

The formal definitions for the update guards are: 

$$U_{version}(x, y, z) \equiv Sync(x, y) \land z = x.vsn + 1$$
$$U_{val}(x, y, z) \equiv \neg Sync(x, y)$$

Note that because of the implication in $Inv_{View}$, the update guard for s.val is written
so as to falsify the antecedent; the guard is independent of $z$, which represents the
class View {
  private s : Subject;
  vsn : int;
  private cache : int;
  invariant s.version - 1 ≤ vsn ≤ s.version ∧ (Sync(this, s) ⇒ cache = s.val);
  invariant s = null ∨ this ∈ s.deps;
  guard s.version := val by Sync(this, s) ∧ val = vsn + 1 yielding Out(this, s);
  guard s.val := val by ¬Sync(this, s);

  Notify()
    requires ¬comm ∧ inv ∧ s.inv ∧ Out(this, s);
    ensures Sync(this, s);
    modifies vsn;
  {
    unpack this;
    vsn := vsn + 1;
    “read state from s, update cache”; // This is why s.inv was required.
    pack this;
  }
}

Fig. 6. The class View . 

value assigned to the field. This enforces a restriction on the order in which the Subject 
can update the fields, even though the reverse order has equivalent effect. The Subject 
must first update its version field to make the implication vacuously true, and only 
then update its val field. 
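
The effect of the ordering can be seen by evaluating the view's invariant after each
assignment. The sketch below is illustrative only: `view_invariant` transcribes the
first declared invariant of View from Figure 6, while `run` and the concrete numbers
are assumptions chosen for the example.

```python
from types import SimpleNamespace


def view_invariant(v):
    """s.version - 1 <= vsn <= s.version, and Sync(v, s) implies cache = s.val."""
    s = v.s
    in_range = s.version - 1 <= v.vsn <= s.version
    synced = v.vsn == s.version
    return in_range and (not synced or v.cache == s.val)


def run(order, new_val):
    """Apply the two subject updates in the given order.

    Returns the truth value of the view invariant after each assignment.
    """
    s = SimpleNamespace(version=7, val=10)
    v = SimpleNamespace(s=s, vsn=7, cache=10)   # the view starts in sync
    results = []
    for step in order:
        if step == "version":
            s.version += 1
        else:
            s.val = new_val
        results.append(view_invariant(v))
    return results
```

Updating `version` first keeps the invariant true after both assignments. Updating
`val` first breaks it after the first assignment, because the view is still in sync
while its cache is stale.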

Allowing the View to impose this requirement on Subject seems unfortunate,
especially since the Subject has unpacked itself at the beginning of Update and so it
would seem it should be able to update its fields in any order as long as it can re-establish
its invariant before it tries to pack itself again. The example illustrates the price to be
paid for the Boogie approach. Having the program invariants hold at “every semicolon”
is conceptually simple and technically robust, but like any programming discipline this
one disallows some programs that are arguably correct and well designed. If an
inconsequential ordering of two assignments is the only annoyance then we are doing very
well indeed.

We demonstrate the verification of the proof obligations imposed by our methodology.
In the granting class Subject, the assert before each of the two field updates in
Update must be satisfied (see (9)). We also have to show that the Dep predicate holds
for every member of deps (see (16)); i.e., we have to show that the following condition
is invariant:

$$(\forall p \mid p \in deps \bullet p \in vs) \tag{17}$$

To verify the assignment to version in Subject. Update , the corresponding update 
guard from View must be satisfied. 

$$
\begin{aligned}
& (\forall p \mid p \in this.deps \bullet \neg p.inv \lor U_{version}(p, this, this.version + 1)) \\
= & \quad \{\text{Definition of guard } U_{version}\} \\
& (\forall p \mid p \in deps \bullet \neg p.inv \lor (Sync(p, this) \land this.version + 1 = p.vsn + 1)) \\
\Leftarrow & \quad \{\text{Strengthening}\} \\
& (\forall p \mid p \in deps \bullet Sync(p, this) \land this.version + 1 = p.vsn + 1) \\
\Leftarrow & \quad \{(17)\} \\
& (\forall p \mid p \in vs \bullet Sync(p, this) \land this.version + 1 = p.vsn + 1) \\
\Leftarrow & \quad \{\text{pre-cond. of Update: } Sync(p, this)\text{, which implies } p.vsn = this.version\} \\
& (\forall p \mid p \in vs \bullet this.version + 1 = this.version + 1) \\
\Leftarrow & \quad true
\end{aligned}
$$

This fulfills the proof obligation (9). 

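To verify the assignment to val in Subject.Update, we use the update guard in
View for val.

$$
\begin{aligned}
& (\forall p \mid p \in this.deps \bullet \neg p.inv \lor U_{val}(p, this, n)) \\
= & \quad \{\text{Definition of guard } U_{val}\} \\
& (\forall p \mid p \in this.deps \bullet \neg p.inv \lor \neg Sync(p, this)) \\
\Leftarrow & \quad \{(17)\} \\
& (\forall p \mid p \in vs \bullet \neg p.inv \lor \neg Sync(p, this)) \\
\Leftarrow & \quad \{\text{By definition, } Out(x, y) \Rightarrow \neg Sync(x, y)\} \\
& (\forall p \mid p \in vs \bullet \neg p.inv \lor Out(p, this))
\end{aligned}
$$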

The last line is a precondition for the update of val , owing to the yielding clause in 
the update guard for version (see Section 5.2). This fulfills the proof obligation (9). 

In order to maintain invariance of (17), while allowing dependent views, Subject 
can provide a method with an attach statement: 

Register(v : View)
  requires ¬comm ∧ inv ∧ v ∉ vs;
  ensures v ∈ vs;
  modifies vs;
{
  unpack this;
  vs := vs + {v};
  attach v;
  pack this;
}

Clearly, this makes (17) an invariant, since there are no other occurrences of attach 
that could modify the value of deps . In Figures 5 and 6 we omit constructors; the 
constructor of View would call Register . 

The obligations on the friend class View are that its advertised update guards maintain
its invariant (see (8)) and that it is in the deps field of the subject upon which it
is dependent (see (10)). The condition required by (10) is a declared conjunct of the
invariant of View.

Each update guard in View must be shown to fulfill the obligation of (8), that its
invariant will not be violated as long as the guard holds. Here we show only the update
guard for version; the one for val is even easier.
$$
\begin{aligned}
& (Inv_{View} \land Sync(this, s) \land val = this.vsn + 1) \Rightarrow (Inv_{View})^{s.version}_{val} \\
= & \quad \{\text{Simplifying } Sync(this, s) \land Inv_{View}\} \\
& (this.vsn = s.version \land val = this.vsn + 1) \Rightarrow (Inv_{View})^{s.version}_{val} \\
= & \quad \{\text{Substitution}\} \\
& (vsn = s.version \land val = vsn + 1) \Rightarrow \\
& \qquad val - 1 \le vsn \le val \land (vsn = val \Rightarrow cache = s.val) \\
\Leftarrow & \quad \{\text{Simplification}\} \\
& true
\end{aligned}
$$

6.2 Producer/Consumer 

In this example, we show two objects that share a common cyclic buffer. There are 
two classes, Producer and Consumer . Their definitions are shown in Figure 7 and 
Figure 8, respectively. 



class Producer {
  buf : int[ ]; n : int; con : Consumer;
  invariant 0 ≤ n < buf.length;
  friend o : Consumer reads con, n, buf keeping o = con;

  Producer(b : int[ ])
    requires b ≠ null ∧ b.length > 1;
    ensures deps = ∅ ∧ inv ∧ ¬comm ∧ n = 0;
  { buf := b; n := 0; pack this; }

  SetCon(c : Consumer)
    requires inv ∧ ¬comm ∧ c ≠ null ∧ deps = ∅;
    modifies con;
    ensures deps = {c} ∧ con = c;
  { unpack this; attach c; con := c; pack this; }

  Produce(x : int)
    requires inv ∧ ¬comm ∧ con ≠ null ∧ con.n ≠ n; // con.n = n ≡ buffer full
    modifies n, buf;
    ensures n ∈ [old(n)..old(con.n)];
  { unpack this; buf[n % buf.length] := x; n := (n + 1) % buf.length; pack this; }
}



Fig. 7. The class Producer . 



We call instances of the former producers and instances of the latter consumers. A 
producer places elements into a circular buffer while consumers read them. Each object 
maintains a cursor into the common buffer; the producer can place more elements into 
the buffer as long as it does not overrun the consumer. Likewise, the consumer can 
only read elements from the buffer as long as its cursor does not overrun the producer’s 
cursor. The buffer is empty when the producer’s cursor is one element ahead (modulo 
the buffer length) of the consumer’s cursor. When the cursors are equal, then the buffer 
class Consumer {
  buf : int[ ]; n : int; pro : Producer;
  invariant pro.con = this ∧ buf ≠ null;
  invariant 0 ≤ n < buf.length;
  guard pro.buf := val by false;
  guard pro.con := val by val = this;
  guard pro.n := val by val ∈ [pro.n..n);

  Consumer(p : Producer)
    requires p.inv ∧ ¬p.comm ∧ p.con = null;
    modifies p.con;
    ensures inv;
  { buf := p.buf; pro := p; n := buf.length - 1; pro.SetCon(this); pack this; }

  Consume() : int
    requires inv ∧ ¬comm ∧ (n + 1) % buf.length < pro.n;
    modifies n;
    ensures n ∈ [old(n)..old(pro.n));
  { unpack this; n := (n + 1) % buf.length; pack this; return (buf[n]); }
}



Fig. 8. The class Consumer . 



is full. Because of this encoding, the buffer’s length must be greater than one and its
capacity is one less than its length. In the specifications we use the notation $[i..j)$ for the
half-open interval from $i$ up to but excluding $j$, allowing for possible “wraparound” due
to the modular arithmetic, e.g., if the modulus n is 5, then $[3..1)$ is $[3, 4, 0]$. Similarly,
$[i..j]$ stands for the closed interval.
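
The wraparound intervals can be enumerated directly. The following is a small
sketch; the function names `half_open` and `closed` are hypothetical, and the choice
that an interval with equal endpoints such as $[i..i)$ is empty is an assumption that
the text does not settle.

```python
def half_open(i, j, n):
    """Indices of [i..j) modulo n: start at i, step upward, stop before j."""
    assert n > 0
    out = []
    k = i % n
    while k != j % n:
        out.append(k)
        k = (k + 1) % n
    return out


def closed(i, j, n):
    """Indices of [i..j] modulo n: the half-open interval followed by j itself."""
    return half_open(i, j, n) + [j % n]
```

With modulus 5, `half_open(3, 1, 5)` yields the indices 3, 4, 0 as in the text, and
`closed(3, 1, 5)` additionally includes 1.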

It is important to note that this is not a full specification of the functional behavior 
of the two classes. The specification is only of the synchronization between the two, 
just as was done for the Subject/View example. For schematic patterns this is especially 
useful; the specification can be combined with a particular usage of the pattern to fill 
out the details. 

The class Consumer is given friend access to buf , con , and n . Being given 
access to buf does not give the consumer the right to depend on the contents of buf in 
its invariant. Such a dependence would be a dependence path of length two: one step to 
buf and then to some index i . We do not allow this; we allow only direct dependence 
on a pivot field. 

The friend access for buf is given to the consumer because it needs to make sure 
the producer does not update the field to a new, different buffer. This is expressed by the 
update guard for pro. buf being false . It is possible to allow the producer to change 
its buffer, either by requiring that the buffer is empty, or even to allow the consumer to 
continue reading from the old buffer as long as the producer no longer is using it. We 
leave these variations as an exercise for the reader. 

The update guard for con is slightly different: it allows the producer to modify 
the field, but only to assign the consumer to it. The update guard for the producer’s 
cursor, n, allows the producer to fill as many slots as are available, even though in this
particular implementation, the producer fills only one slot at a time. 

We do not show the proofs for the field updates in Producer ; all of the proofs are 
immediate. 
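
The cursor discipline described above can be exercised in a few lines. This sketch
merges the two classes into a single hypothetical class `Ring` for brevity and takes
the empty and full conditions from the prose (empty when the producer's cursor is one
ahead of the consumer's, full when the cursors are equal); it models only the
synchronization, not the friend declarations or update guards.

```python
class Ring:
    """Producer and consumer cursors over one shared buffer."""

    def __init__(self, length):
        assert length > 1, "the encoding needs a buffer longer than one"
        self.buf = [None] * length
        self.p = 0               # producer cursor: next slot to write
        self.c = length - 1      # consumer cursor: slot most recently read

    def is_full(self):
        return self.p == self.c

    def is_empty(self):
        return self.p == (self.c + 1) % len(self.buf)

    def produce(self, x):
        assert not self.is_full(), "producer must not overrun the consumer"
        self.buf[self.p] = x
        self.p = (self.p + 1) % len(self.buf)

    def consume(self):
        assert not self.is_empty(), "consumer must not overrun the producer"
        self.c = (self.c + 1) % len(self.buf)
        return self.buf[self.c]
```

A buffer of length 4 accepts three elements before it is full, which matches the
remark that the capacity is one less than the length.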



6.3 Doubly-Linked List with Transfer 

For our last example, we consider a doubly-linked list. The class List with its Insert 
and Push methods is shown in Figure 9. Each List object has a reference to an object 



class List {
  head : Node;
  invariant head = null ∨ (head.prev = null ∧ head.owner = this);

  Insert(x : int)
    requires x > 0 ∧ inv ∧ ¬comm;
    modifies ;
    ensures ;
  { n : Node; n := new Node(x); this.Push(n); }

  Push(n : Node)
    requires inv ∧ ¬comm;
    requires n ≠ null ∧ n.prev = null ∧ n.next = null;
    requires n.inv ∧ ¬n.comm ∧ n.owner = null;
    modifies head, n.comm, n.owner;
    ensures n.comm ∧ n.owner = this;
  {
    unpack this;
    set-owner n to this;
    if (head = null) head := n; else head := head.Insert(n);
    pack this;
  }
}

Fig. 9. The class List . The design caters for a method that transfers ownership of a node, to be 
added in Figure 12. 



of type Node; the nodes have forward and backward references to other Node objects.
In [LM04], this example serves to explain the concept of peers: objects who share a
common owner. Remember by Definition 2 that an object’s invariant is allowed to
depend on its peers. The class Node is shown in Figure 10. Because each of the nodes that
are linked into a particular list share the same owner, if a node is able to pack and unpack
itself, then it is also able to do that for any other node in the list. In terms of our
methodology, this means no update guards are needed. Instead, the recursive friend access is
needed so that a node’s invariant can depend on the node to which its next field points.
The keeping clause maintains that a node keeps a reference to its friend in its prev
field. Thus the quantification in the precondition for field update can be simplified by




Friends Need a Bit More: Maintaining Invariants Over Shared State 






class Node {
  val : int;
  prev : Node;
  next : Node; // pivot field
  friend n : Node reads prev, owner keeping n = prev
  invariant 0 < val ∧ prev ≠ this ∧
    (next = null ∨ (next.owner = owner ∧ next.prev = this));

  Node(x : int)
    requires 0 < x;
    ensures val = x ∧ inv ∧ prev = null ∧ next = null;
  { val := x; prev := null; next := null; pack this; }
}

Fig. 10. Part of the class Node . Other methods are in subsequent Figures. 



the one-point rule. Notice that within the outer “else” clause of Insert (Figure 11), we
unpack the argument n so that we can assign to its pivot field next without worrying
about violating $Inv_{Node}(n)$. All of the conditions required before packing it back up
are met through a combination of the (rather elaborate) pre-conditions on the method
and the assignments that take place in the body of the method. We do not show the
details; all of the required conditions are immediately present.

The Insert method, in Figure 11, returns a value; the post-condition refers to the
return value as result . The modifies clause uses a simple path notation to indicate that 
Insert may modify the fields next and prev on nodes that are reached by following 
a path of accesses along the next field. In the former case, the path may be of zero 
length, while in the latter case the path must be of positive length. 

To add realistic complications to the code, the list is maintained in ascending order 
and if desired this could be expressed using node invariants, again avoiding reachability 
expressions. 
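
The pointer manipulation performed by Insert can be followed in a plain recursive
sketch. The code below is an illustration only: it drops ownership, packing and the
deps bookkeeping, and the names `insert` and `links_ok` are hypothetical. The function
`links_ok` checks the conjunct next.prev = this of the node invariant along the list.

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


def insert(node, n):
    """Insert n into the ascending list starting at node; return its first node."""
    if n.val > node.val:
        # insert after self: either n becomes the last node or it is passed down
        node.next = n if node.next is None else insert(node.next, n)
        node.next.prev = node     # corresponds to next.Attach(this)
        return node
    # insert before self
    n.next = node
    node.prev = n                 # corresponds to this.Attach(n)
    return n


def links_ok(head):
    """Check that next.prev = this holds for every node reachable from head."""
    node = head
    while node is not None:
        if node.next is not None and node.next.prev is not node:
            return False
        node = node.next
    return True
```

As in Figure 11, a node inserted before the current node is returned with its own
prev left unset; the caller one level up is responsible for linking it.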

Figure 12 shows an example of transferring ownership. In this case, the first node in 
one list is moved to another list, s . It is important to see that it transfers the actual object 
of type Node, as well as the contents of the node. The helper function, Disconnect,
removes a node from the entanglements of the pointers in its list and maintains deps.



7 Related Work 

The most closely related work is that of Leino and Müller [LM04] which uses an explicit
owner field that holds a pair $(o, T)$ of the owner together with the type $T$ at which $o$
has a relevant invariant. The paper by Müller et al. [MPHL03] lucidly explains both the
challenges of modular reasoning about object invariants and the solution using ownership.
They prove soundness for a system using types to formulate ownership, based on
Müller’s dissertation [Mül02] which deals with significant design patterns in a realistic
object-oriented language. They also discuss the problem of dependence on non-owned
objects and describe how the problem can be addressed soundly by ensuring that an
Insert(n : Node) : Node
  requires inv ∧ ¬comm;
  requires n ≠ null ∧ n.val > 0 ∧ n.next = null ∧ n.prev = null;
  requires n.inv ∧ ¬n.comm;
  requires prev = null ∨ prev.val < n.val;
  requires owner = n.owner;
  modifies next*.next, next+.prev;
  ensures result ≠ null ∧ result.prev = old(this.prev);
{
  result : Node;
  unpack this;
  if (n.val > val) { // insert after self
    if (next = null) { // this is the last node
      next := n;
    } else { // pass it down the line
      next := next.Insert(n);
    }
    unpack next;
    next.Attach(this);
    pack next;
    result := this;
  } else { // insert before self
    unpack n;
    n.next := this;
    this.Attach(n);
    pack n;
    result := n;
  }
  pack this;
  return result;
}

Attach(n : Node)
  requires ¬inv ∧ n ≠ null;
  ensures prev = n ∧ n ∈ deps;
  modifies deps, prev;
{ attach n; prev := n; }

Detach(n : Node)
  requires ¬inv ∧ n ≠ null ∧ ¬n.inv;
  ensures prev = null ∧ n ∉ deps;
  modifies deps, prev;
{ detach n; prev := null; }

Fig. 11. The methods Insert, Attach, and Detach in class Node. 



object’s invariant is visible where it may be violated; thus sound proof obligations can 
be imposed, as is developed further in [LM04]. Section 2 has reviewed [LM04] and the 
other Boogie paper [BDF+03a] at length and we encourage the reader to consult them
for further comparisons. 
Disconnect()
  requires inv ∧ prev = null ∧ ¬comm ∧ next ≠ null ∧ next.inv;
  ensures next = null ∧ old(next).prev = null;
  modifies next, next.prev, next.deps;
{
  unpack this;
  unpack next;
  next.Detach(this);
  pack next;
  next := null;
  pack this;
}

TransferHeadTo(s : List)
  requires s ≠ this ∧ head ≠ null ∧ ¬comm ∧ inv ∧ ¬s.comm ∧ s.inv;
{
  unpack this;
  n : Node; n := head; head := head.next;
  if (n.next ≠ null) n.Disconnect();
  set-owner n to null;
  pack this;
  s.Push(n);
}



Fig. 12 . Methods TransferHeadTo and Disconnect in class List . 



Object invariants are treated in the Extended Static Checking project, especially
ESC/Modula-3 [DLNS98, LN02, FLL+02], by what Müller [Mül02] calls the visibility
approach which requires invariants to be visible, and thus liable for checking, wherever
they may be violated. This can significantly increase the number of proof obligations
for a given verification unit and the focus of the work is on mitigation by abstraction.
An idiom is used for expressing invariants as implications $valid \Rightarrow \ldots$ where valid
is an ordinary boolean field, serving like inv.

Liskov, Wing, and Guttag [LG86, LW94] treat object invariants but in a way that is
not sound for invariants that depend on more than one object. There has been a lot of
work on alias control to circumscribe dependency. Ownership type systems [CNP01,
Cla01] explicitly address the problem of encapsulating representation objects on which
an invariant may sensibly depend. Much of this line of work struggles to reconcile
efficient static checking with the challenges of practical design patterns. Boyapati, Liskov
and Shrira [BLS03] argue that their variation on ownership types achieves encapsulation
sufficient for sound modular reasoning but they do not formalize reasoning. They
exploit the semantics of inner objects in Java which provides a form of owner field but
suffers from semantic intricacies and precludes ownership transfer.

Banerjee and Naumann [BN02a] use a semantic formulation of ownership in terms
of heap separation and show that it ensures preservation of object invariants. They
focus on two-state invariants, i.e., simulation relations, to obtain a representation
independence result. For this purpose, read access by clients is restricted. The ownership
property is enforced by a static analysis that does not impose the annotation burden of
ownership types but like ownership types it requires the ownership invariant to hold
in every state. A version has been developed that includes transfer of ownership, but
it depends on a static analysis for uniqueness and the proof of soundness was
difficult [BN03]. The representation-independence theorem states that the invariant of a
class T is preserved by clients if it is preserved by methods of T. The theorem allows
invocation of state-mutating methods on pointers outgoing from encapsulated
representation objects, including reentrancy. Unlike work such as [MPHL03], the problem of
verifying methods of T is not addressed.

Separation logic [Rey02] uses new logical connectives to express very directly that 
a predicate depends only on some subset of the objects in the heap. It has successfully 
treated modular reasoning about an object invariant in the case of a single class with 
a single instance [OYR04]. Some of the attractive features are achieved in part by a
restriction to a low-level language without object-oriented features, e.g., the atomic 
points-to predicate describes the complete state of an object. This is an exciting and 
active line of research and it will be interesting to see how it scales to specifications and 
programs like those in Section 6. 

8 Conclusions 

Formal systems for programming must always cope with the conflict between the
flexibility real programs display and the restrictions formal analysis demands. Our work
extends Boogie’s system for object invariants to cope with a real-world situation:
dependence across ownership boundaries. We have constructed a protocol that imposes
minimal obligations upon the participating classes; it is inevitable that there are some 
extra verification conditions. In addition, we have tried to maintain Boogie’s mantra: 
hiding private implementation details while providing explicit knowledge about the 
state of an object’s invariant. Our contribution is a workable system for specifying and 
verifying cooperating classes. 

While one approach would be to allow, or even to insist, that cooperating classes
be knowledgeable about each other’s private implementation state, we believe that it
is important to provide for as much abstraction as possible. The protocols could all be
expressed in terms of more abstract properties instead of concrete fields, allowing a class
implementation to change without disturbing its friend classes.

Our presentation has left out all mention of sub-classing, but the actual definitions 
have all been made taking it into account. 

There are many ways in which we plan to extend our work. For instance, our 
methodology could be presented independently from ownership. Currently, we think 
it best to use ownership where possible and thus it is important that friendship fits well 
with ownership. We also need to explore the use of static analysis for alias control in 
common cases. 

Our update guards are related to constraints [LW94]; it would be interesting to 
formulate them as constraints, thus shifting more of the burden to the granting class 
instead of the friend class. 

We will continue to explore different design decisions to weaken the obligations.
The tradeoff is between making the specifications and code easy to verify and
allowing the programmer the most flexibility.

We are implementing our scheme as part of the Boogie project. Empirical evaluation
will doubtless point out many problems and opportunities for improvement.

Abstract. This work is a case study in program verification: We have
written a simple parser and a corresponding pretty-printer in a non-strict
functional programming language with lifted pairs and functions
(Haskell). A natural aim is to prove that the programs are, in some sense,
each other’s inverses. The presence of partial and infinite values in the 
domains makes this exercise interesting, and having lifted types adds an 
extra spice to the task. We have tackled the problem in different ways, 
and this is a report on the merits of those approaches. More specifically, 
we first describe a method for testing properties of programs in the pres- 
ence of partial and infinite values. By testing before proving we avoid 
wasting time trying to prove statements that are not valid. Then we 
prove that the programs we have written are in fact (more or less) in- 
verses using first fixpoint induction and then the approximation lemma. 



1 Introduction 

Infinite values are commonly used in (non-strict) functional programs, often 
to improve modularity [5]. Partial values are seldom used explicitly, but they 
are still present in all non-trivial Haskell programs because of non-termination, 
pattern match failures, calls to the error function etc. Unfortunately, proofs 
about functional programs often ignore details related to partial and infinite 
values. 

This text is a case study where we explore how one can go about testing 
and proving properties even in the presence of partial and infinite values. We 
use random testing (Sect. 5) and two proof methods, fixpoint induction (Sect. 7) 
and the approximation lemma (Sect. 8), both described in Gibbons and Hutton’s
tutorial [4].

The programs that our case study revolves around are a simple pretty-printer 
and a corresponding parser. Jansson and Jeuring define several more complex 
(polytypic) pretty-printers and parsers and prove them correct for total, finite 
input [7]. The case study in this paper uses cut down versions of those programs 
(see Sect. 2) but proves a stronger statement. On some occasions we have been 
tempted to change the definitions of the programs to be able to formulate our
proofs in a different way. We have not done that, since one part of our goal is 
to explore what it is like to prove properties about programs that have not been 
written with a proof in mind. We have transformed our programs into equivalent 
variants, though; note that this carries a proof obligation. 

Before starting to prove something it is often useful to test the properties. 
That way one can avoid spending time trying to prove something which is not 
true anyway. However, testing partial and infinite values can be tricky. In Sect. 5 
we describe two techniques for doing that. Infinite values can be tested with the 
aid of the approximation lemma, and for partial values we make use of a Haskell 
extension, implemented in several Haskell environments. (The first technique is 
a generalisation of another one, and the last technique is previously known.) 

As indicated above the programming language used for all programs and 
properties is Haskell [12], a non-strict, pure functional language where all types 
are lifted. Since we are careful with all details there will necessarily be some 
Haskell-specific discussions below, but the main ideas should carry over to other 
similar languages. Some knowledge of Haskell is assumed of the reader, though. 

We begin in Sect. 2 by defining the two programs that this case study focuses 
on. Section 3 discusses the computational model and in Sect. 4 we give idealised 
versions of the main properties that we want to prove. By implementing and 
testing the properties (in Sect. 5) we identify a flaw in one of them and we
give a new, refined version in Sect. 6. The proofs presented in Sects. 7 and 8 are 
discussed in the concluding Sect. 9. 

2 Programs 

The programs under consideration parse and pretty-print a simple binary tree 
data type T without any information in the nodes: 

data T = L | B T T



The pretty-printer is really simple. It performs a preorder traversal of the 
tree, emitting a ’B’ for each branching point and an ’L’ for each leaf: 

pretty' :: T -> String
pretty' L       = "L"
pretty' (B l r) = "B" ++ pretty' l ++ pretty' r

The parser reconstructs a tree given a string of the kind produced by pretty'.
Any remaining input is returned together with the tree: 

parse :: String -> (T, String)
parse ('L' : cs) = (L, cs)
parse ('B' : cs) = (B l r, cs'')
  where (l, cs')  = parse cs
        (r, cs'') = parse cs'

We wrap up pretty' so that the printer and the parser get symmetric types:

pretty :: (T, String) -> String
pretty (t, cs) = pretty' t ++ cs

These programs are obviously written in a very naive way. A real pretty- 
printer would not use a quadratic algorithm for printing trees and a real parser 
would use a proper mechanism for reporting parse failures. However, the pro- 
grams have the right level of detail for our application; they are very straightfor- 
ward without being trivial. The tree structure makes the recursion “nonlinear”, 
and that is what makes these programs interesting. 
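
The definitions above can be collected into one loadable file. The following is a minimal sketch under some assumptions: the derived Eq and Show instances and the value sample are additions made only so that results can be compared and printed; they are not part of the programs studied in the text.

```haskell
-- The programs of Sect. 2, transcribed as ordinary Haskell.

-- Binary trees without information in the nodes.
-- (Assumption: Eq and Show are derived only for experimentation.)
data T = L | B T T
  deriving (Eq, Show)

-- Preorder traversal: 'B' for each branching point, 'L' for each leaf.
pretty' :: T -> String
pretty' L       = "L"
pretty' (B l r) = "B" ++ pretty' l ++ pretty' r

-- Rebuilds a tree; unconsumed input is returned together with the tree.
-- Deliberately partial: any other input is a pattern match failure.
parse :: String -> (T, String)
parse ('L' : cs) = (L, cs)
parse ('B' : cs) = (B l r, cs'')
  where (l, cs')  = parse cs
        (r, cs'') = parse cs'

-- Wrapper that gives the printer a type symmetric to the parser's.
pretty :: (T, String) -> String
pretty (t, cs) = pretty' t ++ cs

-- Hypothetical sample value, not taken from the text.
sample :: T
sample = B (B L L) L
```

Loaded into an interpreter, pretty' sample evaluates to "BBLLL", and parse "BBLLLrest" evaluates to (B (B L L) L, "rest").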

3 Computational Model 

Before we begin reasoning about the programs we should specify what our un- 
derlying computational model is. We use Haskell 98 [12], and it is common to 
reason about Haskell programs by using equational reasoning, assuming that a 
simple denotational semantics for the language exists. This is risky, though, since 
this method has not been formally verified to work; there is not even a formal 
semantics for the language to verify it against. (We should mention that some 
work has been done on the static semantics [3].) 

Nevertheless we will follow this approach, taking some caveats into account 
(see below). Although our aim is to explore what a proof would look like when all
issues related to partial and infinite values are considered, it may be that we have 
missed some subtle aspect of the Haskell semantics. We have experimented with 
different levels of detail and believe that the resolution of such issues most likely 
will not change the overall structure of the proofs, though. Even if we would 
reject the idea of a clean denotational semantics for Haskell and instead use 
Sands’ improvement theory [13] based on an operational model, we still believe 
that the proof steps would be essentially the same. 

Now on to the caveats. All types in Haskell are (by default) pointed and lifted;
each type is a complete partial order with a distinct least element ⊥ (bottom),
and data constructors are not strict. For pairs this means that ⊥ ≠ (⊥, ⊥), so
we do not have surjective pairing. It is possible to use strictness annotations
to construct types that are not lifted, e.g. the smash product of two types, for
which ⊥ = (⊥, ⊥), but we still do not have surjective pairing. There is however
no way to construct the ordinary cartesian product of two types.

One has to be careful when using pattern matching in conjunction with lifted
types. The expression let (a, b) = x in g (a, b) is equivalent to g x iff x ≠ ⊥
or g (⊥, ⊥) = g ⊥. The reason is that, if x = ⊥, then in the first case g will
still be applied to (⊥, ⊥), whereas in the second case g will be applied to ⊥.
Note here that the pattern matching in a let clause is not performed until the
variables bound in the pattern are actually used. Hence let (a, b) = ⊥ in (a, b)
is equivalent to (⊥, ⊥), whereas (λ(a, b) → (a, b)) ⊥ = ⊥.

The function type is also lifted; we can actually distinguish between ⊥ :: a → a
and λx → ⊥ :: a → a by using seq, a function with the following semantics [12]:

seq :: a → b → b
seq ⊥ b = ⊥
seq a b = b

(Here a is any value except for ⊥.) In other words η-conversion is not valid for
Haskell functions, so to verify that two functions are equal it is not enough to
verify that they produce identical output when applied to identical input; we also
have to verify that neither (or both) of the functions is ⊥. A consequence of the
lack of η-conversion is that one of the monadic identity laws fails to hold for some
standard Monad instances in Haskell, such as the state “monad”. The existence
of a polymorphic seq also weakens Haskell’s parametricity properties [8], but
that does not directly affect us because our functions are not polymorphic.

Another caveat, also related to seq, is that f = λTrue x → x is not identical
to f' = λTrue → λx → x. By careful inspection of Haskell’s pattern matching
semantics [12] we can see that f False = λx → ⊥ while f' False = ⊥, since the
function f is interpreted as

λa → λb → case (a, b) of
  (True, x) → x

whereas the function f' is interpreted as

λa → case a of
  True → λx → x.

This also applies if f and f' are defined by f True x = x and f' True = λx → x.
We do not get any problems if the first pattern is a simple variable, though. We
will avoid problems related to this issue by never pattern matching on anything
but the last variable in a multiple parameter function definition.
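
The caveats of this section can be observed directly. The following is a rough sketch, not the tooling used later in the text: the helper isBottom is an assumption of ours that reports whether forcing a value to weak head normal form raises an exception, and all other names are hypothetical.

```haskell
import Control.Exception (SomeException, evaluate, try)
import System.IO.Unsafe (unsafePerformIO)

-- Assumed helper: True if forcing x to weak head normal form throws.
-- It cannot detect non-termination.
isBottom :: a -> Bool
isBottom x = unsafePerformIO (fmap check (try (evaluate x)))
  where
    check :: Either SomeException b -> Bool
    check (Left _)  = True
    check (Right _) = False

-- The function type is lifted: ⊥ and λx → ⊥ can be told apart.
bottomFun, constBottom :: () -> ()
bottomFun   = undefined
constBottom = \_ -> undefined

-- Pattern matching in a let is delayed until a bound variable is used,
-- so this is the pair (⊥, ⊥) rather than ⊥.
lazyLet :: ((), ())
lazyLet = let (a, b) = undefined in (a, b)

-- Pattern matching in a lambda happens on application, so this is ⊥.
strictLambda :: ((), ())
strictLambda = (\(a, b) -> (a, b)) undefined

-- The two definition styles compared in the text.
f :: Bool -> a -> a
f True x = x

f' :: Bool -> a -> a
f' True = \x -> x
```

With these definitions isBottom bottomFun and isBottom strictLambda are True, while isBottom constBottom, isBottom lazyLet and isBottom (f False :: () -> ()) are False. In an unoptimised setting such as GHCi, isBottom (f' False :: () -> ()) is True, matching the analysis above; an optimising compiler may eta-expand f' and blur this distinction (GHC documents the flag -fpedantic-bottoms for this situation).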

4 Properties: First Try 

The programs in Sect. 2 are simple enough. Are they correct? That depends on 
what we demand of them. Let us say that we want them to form an embedding- 
projection pair, i.e. 

parse ∘ pretty = id :: (T, String) → (T, String)    (1)

and

pretty ∘ parse ⊑ id :: String → String.    (2)

The operator ⊑ denotes the ordering of the semantical domain, and = is semantical
equality.

More concretely (1) means that for all pairs p :: (T, String) we must have
parse (pretty p) = p. (Note that η-conversion is valid since none of the functions
involved are equal to ⊥; they both expect at least one argument.) The quantification
is over all pairs of the proper type, including infinite and partial values.
If we can prove this equality, then we are free to exchange the left and right
hand sides in any well-typed context. This means that we can use the result
very easily, but we have to pay a price in the complexity of the proof. In this
section we “cheat” by only quantifying over finite, total trees so that we can use
simple structural induction. We return to the full quantification in later sections.

Parse after Pretty. Let us prove (1) for finite, total trees and arbitrary strings, 
just to illustrate what this kind of proof usually looks like. First we observe that 
both sides are distinct from _L, and then we continue using structural induction. 
The inductive hypothesis used is 

∀cs :: String. (parse ∘ pretty) (t, cs) = id (t, cs),

where t :: T is any immediate subtree of the tree treated in the current case. 
We have two cases, for the two constructors of T . The first case is easy (for an 
arbitrary cs :: String ): 

  (parse ∘ pretty) (L, cs)
= {∘}
  parse (pretty (L, cs))
= {pretty}
  parse (pretty' L ++ cs)
= {pretty'}
  parse ("L" ++ cs)
= {++}
  parse ('L' : cs)
= {parse}
  (L, cs)



The second case requires somewhat more work, but is still straightforward. 
(The use of where here is not syntactically correct, but is used for stylistic 
reasons. Just think of it as a postfix let.) 



  (parse ∘ pretty) (B l r, cs)
= {∘, pretty}
  parse (pretty' (B l r) ++ cs)
= {pretty', ++ associative, ++}
  parse ('B' : pretty' l ++ pretty' r ++ cs)
= {parse}
  (B l' r', cs'')
    where (l', cs')  = parse (pretty' l ++ pretty' r ++ cs)
          (r', cs'') = parse cs'
= {pretty, ∘}
  (B l' r', cs'')
    where (l', cs')  = (parse ∘ pretty) (l, pretty' r ++ cs)
          (r', cs'') = parse cs'
= {Inductive hypothesis}
  (B l' r', cs'')
    where (l', cs')  = id (l, pretty' r ++ cs)
          (r', cs'') = parse cs'
= {id, where}
  (B l r', cs'')
    where (r', cs'') = parse (pretty' r ++ cs)
= {pretty, ∘}
  (B l r', cs'')
    where (r', cs'') = (parse ∘ pretty) (r, cs)
= {Inductive hypothesis}
  (B l r', cs'')
    where (r', cs'') = id (r, cs)
= {id, where}
  (B l r, cs)

Hence we have proved using structural induction that (parse ∘ pretty) (t, cs)
= (t, cs) for all finite, total t :: T and for all cs :: String. Thus we can draw the
conclusion that (1) is satisfied for that kind of input.
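
As a sanity check that complements the proof (it does not replace it), property (1) can be evaluated exhaustively on small finite, total inputs. The sketch below assumes derived Eq and Show instances for T; the helper names treesUpTo, roundTrip and checkUpTo, and the particular suffix strings, are hypothetical.

```haskell
-- Programs from Sect. 2; Eq and Show are derived only for this check.
data T = L | B T T
  deriving (Eq, Show)

pretty' :: T -> String
pretty' L       = "L"
pretty' (B l r) = "B" ++ pretty' l ++ pretty' r

parse :: String -> (T, String)
parse ('L' : cs) = (L, cs)
parse ('B' : cs) = (B l r, cs'')
  where (l, cs')  = parse cs
        (r, cs'') = parse cs'

pretty :: (T, String) -> String
pretty (t, cs) = pretty' t ++ cs

-- Hypothetical helper: every total tree of depth at most d.
treesUpTo :: Int -> [T]
treesUpTo d
  | d <= 0    = [L]
  | otherwise = L : [B l r | l <- ts, r <- ts]
  where ts = treesUpTo (d - 1)

-- Property (1) at a single finite, total input.
roundTrip :: (T, String) -> Bool
roundTrip p = (parse . pretty) p == p

-- Property (1) for all trees up to depth d and a few sample suffixes.
checkUpTo :: Int -> Bool
checkUpTo d = and [roundTrip (t, cs) | t <- treesUpTo d, cs <- suffixes]
  where suffixes = ["", "L", "BLL", "xyz"]
```

Here treesUpTo 3 contains 26 trees, and checkUpTo 3 evaluates to True, as the structural induction predicts. Partial and infinite inputs are outside the reach of this check, since derived equality diverges or fails on them.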

Pretty after Parse. We can show that (2) is satisfied in a similar way, using the
fact that all Haskell functions are continuous and hence monotone with respect
to ⊑. In fact, the proof works for arbitrary partial, finite input. We show the
case for cs :: String, head cs = 'B', i.e. cs = 'B' : cs₁ for some (partial and
finite) cs₁ :: String:

  (pretty ∘ parse) ('B' : cs₁)
= {∘, parse}
  pretty (B l r, cs₁'')
    where (l, cs₁')  = parse cs₁
          (r, cs₁'') = parse cs₁'
= {pretty, pretty', ++ associative}
  "B" ++ pretty' l ++ pretty' r ++ cs₁''
    where (l, cs₁')  = parse cs₁
          (r, cs₁'') = parse cs₁'
= {pretty}
  "B" ++ pretty' l ++ pretty (r, cs₁'')
    where (l, cs₁')  = parse cs₁
          (r, cs₁'') = parse cs₁'
= {where, pretty ⊥ = pretty (⊥, ⊥), ∘}
  "B" ++ pretty' l ++ (pretty ∘ parse) cs₁'
    where (l, cs₁') = parse cs₁
⊑ {Inductive hypothesis, monotonicity}
  "B" ++ pretty' l ++ id cs₁'
    where (l, cs₁') = parse cs₁
= {id, pretty, where, pretty ⊥ = pretty (⊥, ⊥), ∘}
  "B" ++ (pretty ∘ parse) cs₁
⊑ {Inductive hypothesis, monotonicity}
  "B" ++ id cs₁
= {id, ++}
  'B' : cs₁

The other cases (head cs ∉ {'L', 'B'} and head cs = 'L') are both straightforward.

Parse after Pretty, Revisited. If we try to allow partial input in (1) instead
of only total input, then we run into problems, as this counterexample shows:

  (parse ∘ pretty) (⊥, cs)
= {∘, pretty}
  parse (pretty' ⊥ ++ cs)
= {pretty', ++}
  parse ⊥
= {parse}
  ⊥ :: (T, String)
≠ {(,) is not strict}
  (⊥, cs) :: (T, String)



We summarise our results so far in a table; we have proved (2) for finite, 
partial input and (1) for finite, total input. We have also disproved (1) in the 
case of partial input. The case marked with ? is treated in Sect. 5 below. 





            Total       Partial

Finite      (2), (1)    (2), ¬(1)

Infinite    ?           ¬(1)



Hence the programs are not correct if we take (1) and (2) plus the type signatures 
of pretty and parse as our specification. Instead of refining the programs to meet 
this specification we will try to refine the specification. This approach is in line 
with our goal from Sect. 1: To prove properties of programs, without changing 
them. 

5 Tests 

As seen above we have to refine our properties, at least (1). To aid us in finding 
properties which are valid for partial and infinite input we will test the properties 
before we try to prove them. 

How do we test infinite input in finite time? An approach which seems to 
work fine is to use the approximation lemma [6]. For T the function approx is 
defined as follows (Nat is a data type for natural numbers):

data Nat = Zero | Succ Nat

approx :: Nat -> T -> T
approx (Succ n) = \t -> case t of
  L     -> L
  B l r -> B (approx n l) (approx n r)

Note that approx Zero is undefined, i.e. ⊥. Hence approx n t traverses n levels
down into the tree t and replaces everything there by ⊥.

For the special case of trees the approximation lemma states that, for any
t₁, t₂ :: T,

t₁ = t₂ iff ∀n ∈ Nat_fin. approx n t₁ = approx n t₂.    (3)

Here Nat_fin stands for the total and finite values of type Nat, i.e. Nat_fin
corresponds directly to ℕ. If we want to test that two expressions yielding possibly
infinite trees are equal then we can use the right hand side of this equivalence.
Of course we cannot test the equality for all n, but if it is not valid then running
the test for small values of n should often be enough to find a counterexample.

Testing equality between lists using take :: Int → [a] → [a] and the take
lemma, an analogue to the approximation lemma, is relatively common. However,
the former does not generalise as easily to other data types as the latter does. The
approximation lemma generalises to any type which can be defined as the least
fixpoint of a locally continuous functor [6]. This includes not only all polynomial
types, but also much more, like nested and exponential types.

Using the approximation lemma we have now reduced testing of infinite values
to testing of partial values. Thus even if we were dealing with total values
only, we would still need to include ⊥ in our tests. Generating the value ⊥ is
easily accomplished:

⊥ :: a
⊥ = error "⊥"

(Note that the same notation is used for the expression that generates a ⊥ as
for the value itself.)

The tricky part is testing for equality. If we do not want to use a separate
tool then we necessarily have to use some impure extension, e.g. exception handling
[11]. Furthermore it would be nice if we could perform these tests in pure
code, such as QuickCheck [2] properties (see below). This can only be accomplished
by using the decidedly unsafe function unsafePerformIO :: IO a → a
[1, 11]. The resulting function isBottom :: a → Bool¹ has to be used with care;
it only detects a ⊥ that results in an exception. However, that is enough for
our purposes, since pattern match failures, error "..." and undefined all raise
exceptions. If isBottom x terminates properly, then we can be certain that the
answer produced (True or False) is correct.

Using isBottom we define a function that compares two arbitrary finite trees
for equality:

(=̂) :: T -> T -> Bool
t1 =̂ t2 = case (isBottom t1, isBottom t2) of
  (True, True)   -> True
  (False, False) -> case (t1, t2) of
    (L, L)           -> True
    (B l r, B l' r') -> l =̂ l' && r =̂ r'
    _                -> False
  _              -> False



¹ The function isBottom used here is a slight variation on the version implemented by
Andy Gill in the libraries shipped with the GHC Haskell compiler. We have to take 
care not to catch e.g. stack overflow exceptions, as these may or may not correspond 
to bottoms. 
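
The pieces introduced so far can be combined into a small test harness for possibly infinite trees. This is a sketch under stated assumptions, not the implementation used for the experiments in the text: isBottom below is a simplified stand-in that treats every exception as ⊥ (the footnote explains why the real version is more careful), the operator is spelled eqT because the typeset symbol is not a legal Haskell name, and toNat, agreeUpTo and the two sample trees are hypothetical.

```haskell
import Control.Exception (SomeException, evaluate, try)
import System.IO.Unsafe (unsafePerformIO)

data T = L | B T T
data Nat = Zero | Succ Nat

-- approx as in the text; approx Zero is deliberately left undefined.
approx :: Nat -> T -> T
approx (Succ n) = \t -> case t of
  L     -> L
  B l r -> B (approx n l) (approx n r)

-- Simplified stand-in for isBottom: True if forcing x to weak head
-- normal form raises any exception. It cannot detect non-termination.
isBottom :: a -> Bool
isBottom x = unsafePerformIO (fmap check (try (evaluate x)))
  where
    check :: Either SomeException b -> Bool
    check (Left _)  = True
    check (Right _) = False

-- Bottom-aware equality on finite (possibly partial) trees.
eqT :: T -> T -> Bool
eqT t1 t2 = case (isBottom t1, isBottom t2) of
  (True, True)   -> True
  (False, False) -> case (t1, t2) of
    (L, L)           -> True
    (B l r, B l' r') -> eqT l l' && eqT r r'
    _                -> False
  _              -> False

-- Hypothetical helper: convert a non-negative Int to Nat.
toNat :: Int -> Nat
toNat k = if k <= 0 then Zero else Succ (toNat (k - 1))

-- Right hand side of (3), restricted to n = 0, ..., k.
agreeUpTo :: Int -> T -> T -> Bool
agreeUpTo k t1 t2 =
  and [eqT (approx (toNat n) t1) (approx (toNat n) t2) | n <- [0 .. k]]

-- Two total, infinite sample trees.
leftInfinite, rightInfinite :: T
leftInfinite  = B leftInfinite L
rightInfinite = B L rightInfinite
```

For example, agreeUpTo 5 leftInfinite leftInfinite is True, whereas agreeUpTo 5 leftInfinite rightInfinite is False: the two trees already differ at n = 2. Both comparisons terminate because approx cuts the trees off at a finite depth.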

Similarly we can define a function (⊑̂) :: T → T → Bool which implements
an approximation of the semantical domain ordering (⊑). The functions approx,
(=̂) and (⊑̂) are prime candidates for generalisation. We have implemented them
using type classes; instances are generated automatically using the “Scrap Your
Boilerplate” approach to generic programming [9].

QuickCheck is a library for defining and testing properties of Haskell functions
[2]. By using the framework developed above we can now give QuickCheck
implementations of properties (1) and (2):

prop₁ n = forAll pair (\p ->
  approxPair n ((parse ∘ pretty) p) =̂ approxPair n (id p))
prop₂ n = forAll string (\cs ->
  approx n ((pretty ∘ parse) cs) ⊑̂ approx n (id cs))
approxPair n (t, cs) = (approx n t, approx (2 * n) cs)

These properties can be read more or less as ordinary set theoretic predicates,
e.g. for prop₁ “for all pairs p the equality ... holds”. The generators pair and
string (defined in Appendix A) ensure that many different finite and infinite
partial values are used for p and cs in the tests. Some values are never generated,
though; see the end of this section.

If we run these tests then we see that prop₁ fails almost immediately, whereas
prop₂ succeeds all the time. In other words (1) is not satisfied (which we already
knew, see Sect. 4), but on the other hand we can be relatively certain that (2)
is valid.

You might be interested in knowing whether (1) holds for total infinite input, 
a case which we have neglected above. We can easily write a test for such a case: 

infiniteTree = B infiniteTree L
propInfiniteTotal n =
  approxPair n ((parse ∘ pretty) p) =̂ approxPair n (id p)
  where p = (infiniteTree, "")

(The value infiniteTree is a left-infinite tree.) When executing this test we run
into trouble, though; the test does not terminate for any n ∈ Nat_fin. The reason is
that the left-hand side does not terminate, and no part of the second component
of the output pair is ever created (i.e. it is ⊥). This can be seen by unfolding the
expression a few steps:

  approxPair n ((parse ∘ pretty) (infiniteTree, ""))
= {Unfold, rearrange slightly}
  (approx n (B l r), approx (2 * n) cs'')
    where (l, cs')  = (parse ∘ pretty) (infiniteTree, "L")
          (r, cs'') = parse cs'

One of the subexpressions is (parse ∘ pretty) (infiniteTree, "L"), which is essentially
the same expression as the one that we started out with, and cs'' will not
be generated until that subexpression has produced any output in its second
component. The right-hand side does terminate, though, so (1) is not valid for
total, infinite input.

Fig. 1. With the left tree called t, the right tree is t' = (fst ∘ parse ∘ pretty') t.

Since prop₁ does not terminate for total, infinite trees we have designed our
QuickCheck generators so that they do not generate such values. This is of course 
a slight drawback. 

6 Properties: Second Try 

As noted above (1) is not valid in general. If we inspect what happens when
fst ∘ parse ∘ pretty' is applied to a partial tree, then we see that as soon as a ⊥
is encountered all nodes encountered later in a preorder traversal of the tree are
replaced by ⊥ (see Fig. 1).

We can easily verify that the example in the figure is correct (assuming that 
the part represented by the vertical dots is a left-infinite total tree): 

t  = B (B (B ⊥ infiniteTree) L) (B L L)
t' = B (B (B ⊥ ⊥) ⊥) ⊥
propFigure = t' =̂ (fst ∘ parse ∘ pretty') t



Evaluating propFigure yields True, as expected. 

Given this background it is not hard to see that (snd ∘ parse ∘ pretty) (t, cs) =
⊥ whenever the tree t is not total. Furthermore (parse ∘ pretty) (t, cs) = ⊥ iff
t = ⊥. Using the preceding results we can write a replacement strictify for id
that makes

parse ∘ pretty = strictify :: (T, String) → (T, String)    (1')

a valid refinement of (1) (as we will see below):

strictify :: (T, a) -> (T, a)
strictify (t, a) = t `seq` (t', tTotal `seq` a)
  where (t', tTotal) = strictify' t

If t = ⊥ then ⊥ should be returned, hence the first seq. The helper function
strictify', which does the main trunk of the work, returns the strictified tree in its
first component. The second component, which is threaded bottom-up through
the computation, is () whenever the input tree is total, and ⊥ otherwise; hence
the second seq. In effect we use the Haskell type () as a boolean type with ⊥
as falsity and () as truth. It is the threading of this “boolean”, in conjunction
with the sequential nature of seq, which enforces the preorder traversal and
strictification indicated in the figure above:

strictify' :: T -> (T, ())
strictify' L       = (L, ())
strictify' (B l r) = (B l' (lTotal `seq` r'), lTotal `seq` rTotal)
  where (l', lTotal) = strictify' l
        (r', rTotal) = strictify' r

Note that if the left subtree l is not total, then the right subtree r should be
replaced by ⊥; hence the use of lTotal `seq` r' above. The second component
should be () iff both subtrees are total, so we use seq as logical and between
lTotal and rTotal; a `seq` b = () iff a = () and b = () for a, b :: ().

Before we go on to prove (1’), let us test it: 

prop₁' n = forAll pair (\p ->
  approxPair n ((parse ∘ pretty) p) =̂ approxPair n (strictify p))

This test seems to succeed all the time - a good indication that we are on the 
right track. 
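
Property (1') can also be checked by hand on a few finite inputs without QuickCheck. The sketch below makes several assumptions: isBottom is a simplified exception-based detector, the bottom-aware comparisons eqT, eqS and eqP are our own helpers (characters are assumed to be total), and partialTree is a hypothetical input.

```haskell
import Control.Exception (SomeException, evaluate, try)
import System.IO.Unsafe (unsafePerformIO)

data T = L | B T T

pretty' :: T -> String
pretty' L       = "L"
pretty' (B l r) = "B" ++ pretty' l ++ pretty' r

parse :: String -> (T, String)
parse ('L' : cs) = (L, cs)
parse ('B' : cs) = (B l r, cs'')
  where (l, cs')  = parse cs
        (r, cs'') = parse cs'

pretty :: (T, String) -> String
pretty (t, cs) = pretty' t ++ cs

-- strictify and strictify' as defined in the text.
strictify :: (T, a) -> (T, a)
strictify (t, a) = t `seq` (t', tTotal `seq` a)
  where (t', tTotal) = strictify' t

strictify' :: T -> (T, ())
strictify' L       = (L, ())
strictify' (B l r) = (B l' (lTotal `seq` r'), lTotal `seq` rTotal)
  where (l', lTotal) = strictify' l
        (r', rTotal) = strictify' r

-- Assumed helper: True if forcing x to weak head normal form throws.
isBottom :: a -> Bool
isBottom x = unsafePerformIO (fmap check (try (evaluate x)))
  where
    check :: Either SomeException b -> Bool
    check (Left _)  = True
    check (Right _) = False

-- Bottom-aware equality on finite trees, strings and pairs.
eqT :: T -> T -> Bool
eqT t1 t2 = case (isBottom t1, isBottom t2) of
  (True, True)   -> True
  (False, False) -> case (t1, t2) of
    (L, L)           -> True
    (B l r, B l' r') -> eqT l l' && eqT r r'
    _                -> False
  _              -> False

eqS :: String -> String -> Bool
eqS xs ys = case (isBottom xs, isBottom ys) of
  (True, True)   -> True
  (False, False) -> case (xs, ys) of
    ([], [])         -> True
    (c : cs, d : ds) -> c == d && eqS cs ds  -- characters assumed total
    _                -> False
  _              -> False

eqP :: (T, String) -> (T, String) -> Bool
eqP p q = case (isBottom p, isBottom q) of
  (True, True)   -> True
  (False, False) -> eqT (fst p) (fst q) && eqS (snd p) (snd q)
  _              -> False

-- Property (1') at a single input.
holds1' :: (T, String) -> Bool
holds1' p = eqP ((parse . pretty) p) (strictify p)

-- Hypothetical partial input: the leftmost leaf is ⊥.
partialTree :: T
partialTree = B (B undefined L) (B L L)
```

With these definitions holds1' (B L L, "xyz"), holds1' (partialTree, "xyz") and holds1' (undefined, "xyz") all evaluate to True, while comparing (parse . pretty) (partialTree, "xyz") against the unmodified input with eqP gives False, in line with the failure of (1) on partial input.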

7 Proofs Using Fixpoint Induction 

Now we will prove (1’) and (2) using two different methods, fixpoint induction
(in this section) and the approximation lemma (in Sect. 8). Not all details will
be presented, since that would take up too much space.

In this section let ψ, ψ₁ etc. stand for arbitrary types.

To be able to use fixpoint induction [4, 14] all recursive functions have to be
defined using fix, which is defined by

\[ \mathit{fix}\ f = \bigsqcup_{i=0}^{\infty} f^{i}\,\bot \tag{4} \]

for any continuous function f :: ψ → ψ. (The notation f^i stands for f composed
with itself i times.) It is easy to implement fix in Haskell, but proving that the
two definitions are equivalent would take up too much space, and is omitted:

fix :: (a -> a) -> a
fix f = f (fix f)
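
The relationship between the chain in (4) and the recursive implementation can be explored on a small example. In the sketch below approxFix and lengthStep are hypothetical names of ours, and the list-length functional is an illustration that does not occur in the text.

```haskell
-- fix as implemented in the text.
fix :: (a -> a) -> a
fix f = f (fix f)

-- The i-th element f^i ⊥ of the chain whose least upper bound is fix f.
approxFix :: Int -> (a -> a) -> a
approxFix i f = iterate f undefined !! i

-- Illustrative functional whose fixpoint is the length function.
lengthStep :: ([b] -> Int) -> [b] -> Int
lengthStep _   []       = 0
lengthStep rec (_ : xs) = 1 + rec xs
```

Here fix lengthStep "abcd" is 4. The approximation approxFix i lengthStep is defined exactly on lists shorter than i, so approxFix 3 lengthStep "ab" is 2 whereas approxFix 2 lengthStep "ab" is ⊥. Each element of the chain agrees with the limit wherever it is defined.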

Let P be a chain-complete predicate, i.e. a predicate which is true for the
least upper bound of a chain whenever it is true for all the elements in the chain.
In other words, if P(f^i ⊥) is true for all i ∈ ℕ and some f :: ψ → ψ, then we
know that P(fix f) is true (we only consider ω-chains). Generalising we get the
following inference rule from ordinary induction over natural numbers (and some
simple domain theory):

\[
\frac{P(\bot, \bot, \ldots, \bot) \qquad \forall n \in \mathbb{N}.\ \bigl( P(f_1^{n}\,\bot, f_2^{n}\,\bot, \ldots, f_m^{n}\,\bot) \Rightarrow P(f_1^{n+1}\,\bot, f_2^{n+1}\,\bot, \ldots, f_m^{n+1}\,\bot) \bigr)}
     {P(\mathit{fix}\ f_1, \mathit{fix}\ f_2, \ldots, \mathit{fix}\ f_m)}
\tag{5}
\]

Here m ∈ ℕ and the f_i are continuous functions f_i :: ψ_i → ψ_i. We also have
the following useful variant which follows immediately from the previous one,
assuming that the ψ_i are function types, ψ_i = ψ_i' → ψ_i'', and that all f_i are
strictness-preserving, i.e. if g_i is strict then f_i g_i should be strict as well.

\[
\frac{P(\bot, \bot, \ldots, \bot) \qquad \forall \text{ strict } g_1 :: \psi_1, g_2 :: \psi_2, \ldots, g_m :: \psi_m.\ \bigl( P(g_1, g_2, \ldots, g_m) \Rightarrow P(f_1\,g_1, f_2\,g_2, \ldots, f_m\,g_m) \bigr)}
     {P(\mathit{fix}\ f_1, \mathit{fix}\ f_2, \ldots, \mathit{fix}\ f_m)}
\tag{6}
\]

That is all the theory that we need for now; on to the proofs. Let us begin 
by defining variants of our recursive functions using fix: 

pretty'_fix :: T -> String
pretty'_fix = fix pretty_step

pretty_step :: (T -> String) -> T -> String
pretty_step p L       = "L"
pretty_step p (B l r) = "B" ++ p l ++ p r

parse_fix :: String -> (T, String)
parse_fix = fix parse_step

parse_step :: (String -> (T, String)) -> String -> (T, String)
parse_step p' ('L' : cs) = (L, cs)
parse_step p' ('B' : cs) = (B l r, cs'')
  where (l, cs')  = p' cs
        (r, cs'') = p' cs'

strictify'_fix :: T -> (T, ())
strictify'_fix = fix strictify_step

strictify_step :: (T -> (T, ())) -> T -> (T, ())
strictify_step s L       = (L, ())
strictify_step s (B l r) = (B l' (lTotal `seq` r'), lTotal `seq` rTotal)
  where (l', lTotal) = s l
        (r', rTotal) = s r

Of course using these definitions instead of the original ones implies a proof
obligation; we have to show that the two sets of definitions are equivalent to 
each other. In a standard domain theoretic setting this would follow immediately 
from the interpretation of a recursively defined function. In the case of Haskell 
this requires some work, though. The proofs are certainly possible to perform, 
but they would lead us too far astray, so we omit them here. 
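
Although the equivalence proofs are omitted, the two sets of definitions can at least be compared on sample inputs. The following sketch uses camel-case names (prettyStep, parseFix and so on) in place of the subscripted names of the text; these names, the derived Eq and Show instances and the helper sameOn are assumptions of ours, and agreement on a few inputs is of course no substitute for the proof obligation.

```haskell
data T = L | B T T
  deriving (Eq, Show)

fix :: (a -> a) -> a
fix f = f (fix f)

-- Non-recursive step functions and their fixpoints.
prettyStep :: (T -> String) -> T -> String
prettyStep _ L       = "L"
prettyStep p (B l r) = "B" ++ p l ++ p r

prettyFix' :: T -> String
prettyFix' = fix prettyStep

parseStep :: (String -> (T, String)) -> String -> (T, String)
parseStep _  ('L' : cs) = (L, cs)
parseStep p' ('B' : cs) = (B l r, cs'')
  where (l, cs')  = p' cs
        (r, cs'') = p' cs'

parseFix :: String -> (T, String)
parseFix = fix parseStep

-- The directly recursive definitions from Sect. 2, for comparison.
pretty' :: T -> String
pretty' L       = "L"
pretty' (B l r) = "B" ++ pretty' l ++ pretty' r

parse :: String -> (T, String)
parse ('L' : cs) = (L, cs)
parse ('B' : cs) = (B l r, cs'')
  where (l, cs')  = parse cs
        (r, cs'') = parse cs'

-- Do both sets of definitions agree on the tree t and its printed form?
sameOn :: T -> Bool
sameOn t = prettyFix' t == pretty' t && parseFix s == parse s
  where s = pretty' t ++ "rest"
```

For instance, all sameOn [L, B L L, B (B L L) L] evaluates to True.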

The properties have to be unrolled to fit the requirements of the inference 
rules. To make the properties more readable we define new versions of some other 
functions as well: 

pretty_fix :: (T -> String) -> (T, String) -> String
pretty_fix p (t, cs) = p t ++ cs

strictify_fix :: (T -> (T, ())) -> (T, a) -> (T, a)
strictify_fix s (t, a) = t `seq` (t', tTotal `seq` a)
  where (t', tTotal) = s t

We end up with 

P₁(p, p', s) = (p' ∘ pretty_fix p = strictify_fix s)    (7)

and

P₂(p, p') = (pretty_fix p ∘ p' ⊑ id).    (8)

However, we cannot use P₁ as it stands since P₁(⊥, ⊥, ⊥) is not true. To see this,
pick an arbitrary cs :: String and a t :: T satisfying t ≠ ⊥:

  (⊥ ∘ pretty_fix ⊥) (t, cs)
= {∘, ⊥}
  ⊥ :: (T, String)
≠ {seq, t ≠ ⊥, (,) is not strict}
  t `seq` (⊥, ⊥) :: (T, String)
= {seq}
  t `seq` (⊥, ⊥ `seq` cs) :: (T, String)
= {where, pattern matching}
  t `seq` (t', tTotal `seq` cs) :: (T, String)
    where (t', tTotal) = ⊥
= {⊥}
  t `seq` (t', tTotal `seq` cs) :: (T, String)
    where (t', tTotal) = ⊥ t
= {strictify_fix}
  strictify_fix ⊥ (t, cs)

We can still go on by noticing that we are only interested in the property in 
the limit and redefining it as 

P₁'(p, p', s) = P₁(pretty_step p, parse_step p', strictify_step s),    (7')

i.e. P₁'(p, p', s) is equivalent to

parse_step p' ∘ pretty_fix (pretty_step p) = strictify_fix (strictify_step s).    (9)

With P₁' we avoid the troublesome base case since P₁'(⊥, ⊥, ⊥) is equivalent to

P₁(pretty_step ⊥, parse_step ⊥, strictify_step ⊥).

Now it is straightforward to verify that P₁'(⊥, ⊥, ⊥) and P₂(⊥, ⊥) are valid
(P₁' requires a tedious but straightforward case analysis). It is also easy to
verify that the predicates are chain-complete using general results from domain
theory [14]. As we have already stated above, verifying formally that
P₁'(fix pretty_step, fix parse_step, fix strictify_step) is equivalent to (1') and
similarly that P₂(fix pretty_step, fix parse_step) is equivalent to (2) requires more work
and is omitted.

Pretty after Parse. Now on to the main work. Let us begin with P₂. Since
we do not need the tighter inductive hypothesis of inference rule (5) we will use
inference rule (6); it is easy to verify that pretty_step and parse_step are
strictness-preserving. Assume now that P₂(p, p') is valid for strict p :: T → String and
p' :: String → (T, String). We have to show that P₂(pretty_step p, parse_step p') is
valid. After noting that both sides of the inequality are distinct from ⊥, take an
arbitrary element cs :: String. The proof is a case analysis on head cs.

First case, head cs ∉ {'L', 'B'}:

  (pretty_fix (pretty_step p) ∘ parse_step p') cs
= {∘, parse_step, head cs ∉ {'L', 'B'}}
  pretty_fix (pretty_step p) ⊥
= {pretty_fix}
  ⊥ :: String
⊑ {⊥ is the least element in the domain}
  id cs

Second case, head cs = 'L', i.e. cs = 'L' : cs₁ for some cs₁ :: String:

  (pretty_fix (pretty_step p) ∘ parse_step p') cs
= {∘, parse_step, cs = 'L' : cs₁}
  pretty_fix (pretty_step p) (L, cs₁)
= {pretty_fix}
  pretty_step p L ++ cs₁
= {pretty_step, ++, id}
  id cs

Last case, head cs = ’B’, i.e. cs = ’B’ : csi for some cs 1 :: String: 

(pretty fix (pretty st e P p) ° ( parse step p')) cs 
= {°. Parse st e P , cs = ’B> : csi} 

Pretty fix ( pretty s t ep p) (B l r, cs") 
where (l, cs}) = p' cs 1 
(r, cs") = p' cs} 
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= {pretty fix , pretty step , -H- associative} 

"B" -H- p l 4f (p r -H- cs") 
where (l, cs}) = p' cs 1 
(r, cs") = p' cs} 

= {pretty fix } 

"B" -H- p l -H- pretty fix p (r, cs") 
where (l, cs}) = p' csi 
( r , cs") = p' cs} 

= {where, p strict implies that pretty fix p _L = pretty fix p (_L, _L), o} 
"B" -H- p l 4f (pretty fix p o p') cs} 
where (l, cs}) = p' csi 
C {Inductive hypothesis, monotonicity} 

"B" 4f p l -H- id cs} 
where (l, cs}) = p' csi 
= {id, pretty where, p strict, o} 

"B" -H- (pretty fix pop') csi 
C {Inductive hypothesis, monotonicity} 

"B" 4f id csi 
= {id, -H-, id} 
id cs 



This concludes the proof for P2.

Parse after Pretty. For P1' we will also use inference rule (6); in addition
to pretty_step and parse_step it is easy to verify that strictify_step is
strictness-preserving. Assume that P1'(p0, p0', s0) is valid, where p0, p0' and s0 are all strict.
We have to prove that P1'(pretty_step p0, parse_step p0', strictify_step s0) is valid.
This is equivalent to proving P1(pretty_step p, parse_step p', strictify_step s), where
p = pretty_step p0, p' = parse_step p0' and s = strictify_step s0. The first step of
this proof is to note that both sides of the equality in P1' are distinct from ⊥.
The rest of the proof is, as before, performed using case analysis, this time on
pair, an arbitrary element in (T, String). The cases pair = ⊥, pair = (⊥, cs) and
pair = (L, cs) for an arbitrary cs :: String are straightforward and omitted.

Last case, pair = (B l r, cs) for arbitrary subtrees l, r :: T and an arbitrary
cs :: String:

    (parse_step p' ∘ pretty_fix (pretty_step p)) (B l r, cs)
  = {∘, pretty_fix}
    parse_step p' (pretty_step p (B l r) ++ cs)
  = {pretty_step, ++, ++ associative}
    parse_step p' ('B' : p l ++ (p r ++ cs))
  = {parse_step}
    (B l' r', cs'')
      where (l', cs')  = p' (p l ++ (p r ++ cs))
            (r', cs'') = p' cs'
  = {pretty_fix, ∘}
    (B l' r', cs'')
      where (l', cs')  = (p' ∘ pretty_fix p) (l, p r ++ cs)
            (r', cs'') = p' cs'
  = {Inductive hypothesis}
    (B l' r', cs'')
      where (l', cs')  = strictify_fix s (l, p r ++ cs)
            (r', cs'') = p' cs'
  = {strictify_fix}
    (B l' r', cs'')
      where (l', cs')    = l `seq` (t', tTotal `seq` p r ++ cs)
            (t', tTotal) = s l
            (r', cs'')   = p' cs'
  = {Simple case analysis on l (⊥ or not ⊥), pattern matching}
    (B l' r', cs'')
      where (l', cs')    = (l `seq` t', l `seq` tTotal `seq` p r ++ cs)
            (t', tTotal) = s l
            (r', cs'')   = p' cs'
  = {seq, if l = ⊥ then t' = tTotal = ⊥ since s is strict}
    (B l' r', cs'')
      where (l', cs')    = (t', tTotal `seq` p r ++ cs)
            (t', tTotal) = s l
            (r', cs'')   = p' cs'
  = {where}
    (B t' r', cs'')
      where (t', tTotal) = s l
            (r', cs'')   = p' (tTotal `seq` p r ++ cs)
  = {Rename variables}
    (B l' r', cs'')
      where (l', lTotal) = s l
            (r', cs'')   = p' (lTotal `seq` p r ++ cs)



The rest of the proof is straightforward. Using case analysis on lTotal we
prove that

    (B l' r', cs'')
      where (r', cs'') = p' (lTotal `seq` p r ++ cs)
  =
    (B l' (lTotal `seq` r'), lTotal `seq` rTotal `seq` cs)
      where (r', rTotal) = s r

is valid. In one branch one can observe that p' is strict. In the other the inductive
hypothesis can be applied, followed by reasoning analogous to the one for l `seq`
above. Given this equality the rest of the proof is easy. Hence we can draw the
conclusion that P1(pretty_step p, parse_step p', strictify_step s) is valid, which means
that we have finished the proof.
8 Proofs Using the Approximation Lemma

Let us now turn to the approximation lemma. This lemma was presented above 
in Sect. 5, but we still have a little work to do before we can go to the proofs. 



Pretty after Parse. Any naive attempt to prove (2) using the obvious inductive
hypothesis fails. Using the following less obvious reformulated property does the
trick, though:

    ∀m ∈ ℕ . pp_m ⊑ id :: String → String.    (10)

Here we use a family of helper functions pp_m (m ∈ ℕ):

    pp_m cs = pretty' t_1 ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m
      where (t_1, cs_1) = parse cs
            (t_2, cs_2) = parse cs_1
            ...
            (t_m, cs_m) = parse cs_{m-1}

(We interpret pp_0 as id.) It is straightforward to verify that this property is
equivalent to (2).
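
The family pp_m can also be written recursively, which may make the later calculation steps easier to follow. The following is only a sketch: the definitions of T, pretty' and parse are reconstructed from the calculation steps in this section rather than copied from the paper, the catch-all clause of parse stands in for ⊥, and the name pp (taking m as an Int) is an assumption.

```haskell
-- Binary trees, as assumed from the calculations in this section.
data T = L | B T T

-- Assumed pretty-printer: pre-order traversal.
pretty' :: T -> String
pretty' L       = "L"
pretty' (B l r) = "B" ++ pretty' l ++ pretty' r

-- Assumed parser: returns the parsed tree and the unconsumed input.
parse :: String -> (T, String)
parse ('L' : cs) = (L, cs)
parse ('B' : cs) = (B l r, cs'')
  where (l, cs')  = parse cs
        (r, cs'') = parse cs'
parse _          = undefined  -- stands in for ⊥ on all other input

-- pp m cs parses m trees from cs, pretty-prints them and appends the rest.
-- pp 0 is the identity, matching the convention stated in the text.
pp :: Int -> String -> String
pp m cs
  | m <= 0    = cs
  | otherwise = pretty' t ++ pp (m - 1) cs'
  where (t, cs') = parse cs
```

On input that starts with m well-formed trees, pp m returns the input unchanged; on other input the result is a partial string that approximates the input, which is what property (10) states.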

Note that we cannot use the approximation lemma directly as it stands,
since the lemma deals with equalities, not inequalities. However, replacing each
= with ⊑ in the proof of the approximation lemma in Gibbons' and Hutton's
article [4, Sect. 4] is enough to verify this variant. We get that, for all m ∈ ℕ
and cs :: String,

    pp_m cs ⊑ id cs  iff  ∀n ∈ Nat_fin . approx n (pp_m cs) ⊑ approx n (id cs).    (11)

Hence all that we need to do is to prove the last statement above (after noticing
that both pp_m and id are distinct from ⊥, for all m ∈ ℕ). We do that by
induction over n, after observing that we can change the order of the universal
quantifiers so that we get

    ∀n ∈ Nat_fin . ∀m ∈ ℕ . ∀cs :: String .
      approx n (pp_m cs) ⊑ approx n (id cs),    (12)

which is equivalent to the inequalities above.

For lists we have the following variant of approx:

    approx :: Nat → [a] → [a]
    approx (Succ n) = λ(x : xs) → x : approx n xs
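
As a rough illustration, the list variant of approx can be transcribed into plain Haskell as follows. This is a sketch under assumptions: the Nat type is taken to be the usual unary naturals, the missing clause for Zero is made explicit with undefined, and toNat and firstThree are hypothetical helpers that do not appear in the text.

```haskell
-- Unary natural numbers, assumed to match the Nat used in the text.
data Nat = Zero | Succ Nat

-- Hypothetical helper: convert a non-negative Int to a Nat.
toNat :: Int -> Nat
toNat k
  | k <= 0    = Zero
  | otherwise = Succ (toNat (k - 1))

-- approx n keeps the first n cells of a list and replaces the rest by ⊥.
-- The text gives no clause for Zero; undefined makes that explicit here.
approx :: Nat -> [a] -> [a]
approx Zero     = undefined
approx (Succ n) = \(x : xs) -> x : approx n xs

-- The first three cells are defined even when the argument is infinite.
firstThree :: [Integer]
firstThree = take 3 (approx (toNat 3) [1 ..])
```

Forcing a fourth element of approx (toNat 3) [1 ..] would hit the undefined tail, which is the behaviour the induction over n relies on.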

Since approx Zero is undefined the statement (12) is trivially true for n = Zero.
Assume now that ∀m ∈ ℕ . ∀cs :: String . approx n (pp_m cs) ⊑ approx n (id cs)
is true for some n ∈ Nat_fin. Take an arbitrary m ∈ ℕ. Note that the property
that we want to prove is trivially true for m = 0, so assume that m ≥ 1. We
proceed by case analysis on head cs.

First case, head cs ∉ {'L', 'B'}:

    approx (Succ n) (pp_m cs)
  = {parse, where, pretty', ++}
    approx (Succ n) ⊥
  ⊑ {⊥ is the least element, monotonicity}
    approx (Succ n) (id cs)

Second case, head cs = 'L', i.e. cs = 'L' : cs' for some cs' :: String:

    approx (Succ n) (pp_m ('L' : cs'))
  = {pp_m, m ≥ 1}
    approx (Succ n) (pretty' t_1 ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (t_1, cs_1) = parse ('L' : cs')
            (t_2, cs_2) = parse cs_1
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {parse, where, note that if m = 1 then cs_m = cs'}
    approx (Succ n) (pretty' L ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (t_2, cs_2) = parse cs'
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {pretty', ++}
    approx (Succ n) ('L' : pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (t_2, cs_2) = parse cs'
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {approx}
    'L' : approx n (pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (t_2, cs_2) = parse cs'
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {pp_{m-1}, m ≥ 1}
    'L' : approx n (pp_{m-1} cs')
  ⊑ {Inductive hypothesis, monotonicity}
    'L' : approx n (id cs')
  = {id, approx}
    approx (Succ n) ('L' : cs')

Last case, head cs = 'B', i.e. cs = 'B' : cs' for some cs' :: String:

    approx (Succ n) (pp_m ('B' : cs'))
  = {pp_m, m ≥ 1}
    approx (Succ n) (pretty' t_1 ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (t_1, cs_1) = parse ('B' : cs')
            (t_2, cs_2) = parse cs_1
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {parse, where}
    approx (Succ n) (pretty' (B l r) ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (l, ls)     = parse cs'
            (r, rs)     = parse ls
            (t_2, cs_2) = parse rs
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {pretty', ++, ++ associative}
    approx (Succ n)
      ('B' : pretty' l ++ pretty' r ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (l, ls)     = parse cs'
            (r, rs)     = parse ls
            (t_2, cs_2) = parse rs
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {approx}
    'B' : approx n (pretty' l ++ pretty' r ++ pretty' t_2 ++ ... ++ pretty' t_m ++ cs_m)
      where (l, ls)     = parse cs'
            (r, rs)     = parse ls
            (t_2, cs_2) = parse rs
            ...
            (t_m, cs_m) = parse cs_{m-1}
  = {pp_{m+1}}
    'B' : approx n (pp_{m+1} cs')
  ⊑ {Inductive hypothesis, monotonicity}
    'B' : approx n (id cs')
  = {id, approx}
    approx (Succ n) ('B' : cs')



Hence we have yet again proved (2), this time using the approximation lemma. 



Parse after Pretty. Let us now turn to (1'). We want to verify that parse ∘
pretty = strictify :: (T, String) → (T, String) holds. This can be done using the
approximation lemma as given in equivalence (3). To ease the presentation we
will use the following helper function:

    approxP :: Nat → (T, a) → (T, a)
    approxP n (t, a) = (approx n t, a)

Using this function we can formulate the approximation lemma as

    p1 = p2  iff  ∀n ∈ Nat_fin . approxP n p1 = approxP n p2    (13)

for arbitrary pairs p1, p2 :: (T, ψ), where ψ is an arbitrary type. In our case
ψ = String, p1 = (parse ∘ pretty) p and p2 = strictify p for an arbitrary pair
p :: (T, String).
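
A small executable sketch of approxP may help here. The tree version of approx is not shown in this section, so the definition below is an assumption modelled on the list variant: approx n keeps n levels of the tree and replaces everything below by ⊥.

```haskell
-- Binary trees and unary naturals, as assumed from earlier sections.
data T = L | B T T deriving (Eq, Show)
data Nat = Zero | Succ Nat

-- Assumed tree version of approx; approx Zero is ⊥, written as undefined.
approx :: Nat -> T -> T
approx Zero     = undefined
approx (Succ n) = \t -> case t of
  L     -> L
  B l r -> B (approx n l) (approx n r)

-- approxP approximates the tree component and leaves the second one alone.
approxP :: Nat -> (T, a) -> (T, a)
approxP n (t, a) = (approx n t, a)
```

With two levels, approxP (Succ (Succ Zero)) leaves (B L L, cs) unchanged, while with Zero only the second component of the pair can be inspected safely.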

The proof proceeds by induction over n as usual; and as usual we first have
to observe that parse ∘ pretty and strictify are both distinct from ⊥. The case
n = Zero is trivial. Now assume that we have proved approxP n ((parse ∘
pretty) p) = approxP n (strictify p) for some n ∈ Nat_fin and all p :: (T, String).
(All p since we can change the order of the universal quantifiers like we did to
arrive at inequality (12).) We prove the corresponding statement for Succ n by
case analysis on p. All cases except for the one where p = (B l r, cs) for arbitrary
subtrees l, r :: T and an arbitrary cs :: String are straightforward and omitted,
so we go directly to the last case:

    approxP (Succ n) ((parse ∘ pretty) (B l r, cs))
  = {∘, pretty, pretty', ++, ++ associative}
    approxP (Succ n) (parse ('B' : pretty' l ++ pretty' r ++ cs))
  = {parse, pretty, ∘}
    approxP (Succ n) (B l' r', cs'')
      where (l', cs')  = (parse ∘ pretty) (l, pretty' r ++ cs)
            (r', cs'') = parse cs'
  = {approxP, approx}
    (B (approx n l') (approx n r'), cs'')
      where (l', cs')  = (parse ∘ pretty) (l, pretty' r ++ cs)
            (r', cs'') = parse cs'
  = {Push approx n through the pairs, turning it into approxP n}
    (B l' r', cs'')
      where (l', cs')  = approxP n ((parse ∘ pretty) (l, pretty' r ++ cs))
            (r', cs'') = approxP n (parse cs')
  = {Inductive hypothesis}
    (B l' r', cs'')
      where (l', cs')  = approxP n (strictify (l, pretty' r ++ cs))
            (r', cs'') = approxP n (parse cs')
  = {strictify}
    (B l' r', cs'')
      where (l', cs')    = approxP n (l `seq` (t', tTotal `seq` pretty' r ++ cs))
            (t', tTotal) = strictify' l
            (r', cs'')   = approxP n (parse cs')

The proof proceeds by case analysis on l. We omit the cases l = ⊥ and l = L
and go to the last case, l = B l1 r1 for arbitrary subtrees l1, r1 :: T:

    (B l' r', cs'')
      where
        (l', cs')    = approxP n (B l1 r1 `seq` (t', tTotal `seq` pretty' r ++ cs))
        (t', tTotal) = strictify' (B l1 r1)
        (r', cs'')   = approxP n (parse cs')
  = {seq, strictify', where}
    (B l' r', cs'')
      where
        (l', cs')     = approxP n (B l1' (lTotal `seq` r1'),
                                   (lTotal `seq` rTotal) `seq` pretty' r ++ cs)
        (l1', lTotal) = strictify' l1
        (r1', rTotal) = strictify' r1
        (r', cs'')    = approxP n (parse cs')
  = {approxP, where}
    (B (approx n (B l1' (lTotal `seq` r1'))) r', cs'')
      where
        (l1', lTotal) = strictify' l1
        (r1', rTotal) = strictify' r1
        (r', cs'')    = approxP n (parse ((lTotal `seq` rTotal) `seq` pretty' r ++ cs))

Now we have two cases, depending on whether lTotal `seq` rTotal, i.e. snd
(strictify' l1) `seq` snd (strictify' r1), equals ⊥ or not. We omit the case where
the equality holds and concentrate on the case where lTotal `seq` rTotal = () ≠ ⊥:

    (B (approx n (B l1' (lTotal `seq` r1'))) r', cs'')
      where (l1', lTotal) = strictify' l1
            (r1', rTotal) = strictify' r1
            (r', cs'')    = approxP n (parse (() `seq` pretty' r ++ cs))
  = {seq, pretty, ∘}
    (B (approx n (B l1' (lTotal `seq` r1'))) r', cs'')
      where (l1', lTotal) = strictify' l1
            (r1', rTotal) = strictify' r1
            (r', cs'')    = approxP n ((parse ∘ pretty) (r, cs))
  = {Inductive hypothesis}
    (B (approx n (B l1' (lTotal `seq` r1'))) r', cs'')
      where (l1', lTotal) = strictify' l1
            (r1', rTotal) = strictify' r1
            (r', cs'')    = approxP n (strictify (r, cs))
  = {Push approxP n through the pair, turning it into approx n}
    (B (approx n (B l1' (lTotal `seq` r1'))) (approx n r'), cs'')
      where (l1', lTotal) = strictify' l1
            (r1', rTotal) = strictify' r1
            (r', cs'')    = strictify (r, cs)
  = {approx, approxP}
    approxP (Succ n) (B (B l1' (lTotal `seq` r1')) r', cs'')
      where (l1', lTotal) = strictify' l1
            (r1', rTotal) = strictify' r1
            (r', cs'')    = strictify (r, cs)

The rest of the proof consists of transforming the expression above to approxP
(Succ n) (strictify (B (B l1 r1) r, cs)). This is relatively straightforward and
omitted. Thus we have, yet again, proved (1').

9 Discussion and Future Work 

In this paper we have investigated how different verification methods can handle 
partial and infinite values in a simple case study about data conversion. We have 
used random testing, fixpoint induction and the approximation lemma. 

Using isBottom and approx for testing in the presence of partial and infinite
values is not foolproof but works well in practice. The approach is not that
original; testing using isBottom and take is (indirectly) mentioned already in
the original QuickCheck paper [2]. However, testing using approx has probably
not been done before. Furthermore, the functionality of = and ⊑ has not been
provided by any (widespread) library.

The two methods used for proving the properties (1’) and (2) have different 
qualities. Fixpoint induction required us to rewrite both the functions and the 
properties. Furthermore one property did not hold for the base case, so it had to 
be rewritten (7’), and proving the base case required some tedious but straight- 
forward work. On the other hand, once the initial work had been completed 
the “actual proofs” were comparatively short. The corresponding “actual proofs” 
were longer when using the approximation lemma. The reason for this is proba- 
bly that the approximation lemma requires that the function approx is “pushed” 
inside the expressions to make it possible to apply the inductive hypothesis. For 
fixpoint induction that is not necessary. For instance, when proving (1’) using the 
approximation lemma we had to go one level further down in the tree when per- 
forming case analysis, than in the corresponding proof using fixpoint induction. 
This was in order to be able to use the inductive hypothesis. 

Nevertheless, the “actual proofs” are not really what is important. They 
mostly consist of performing a case analysis, evaluating both sides of the (in-) 
equality being proved as far as possible and then, if the proof is not finished 
yet, choosing a new expression to perform case analysis on. The most important 
part is really finding the right inductive hypothesis. (Choosing the right expres- 
sion for case analysis is also important, but easier.) Finding the right inductive 
hypothesis was easier when using fixpoint induction than when using the ap- 
proximation lemma. Take the proofs of (2), for instance. When using fixpoint 
induction almost no thought was needed to come up with the inductive hypoth- 
esis, whereas when using the approximation lemma we had to come up with the 
complex hypothesis based on property (10), the one involving pp m . The reason 
was the same as above; approx has to be in the right position. It is of course 
possible that easier proofs exist. 

It is also possible that there are other proof methods which work better 
than the ones used here. Coinduction and fusion, two other methods mentioned 
in Gibbons’ and Hutton’s tutorial [4], might belong to that category. We have 
made some attempts at using unfold fusion. Due to the nature of the programs 
the standard fusion method seems inapplicable, though; a monadic variant is a 
better fit. The programs can be transformed into monadic variants (which of
course carries extra proof obligations). We have not yet figured out where to 
go from there, though. For instance, the monadic anamorphism fusion law [10, 
Equation (17)] only applies to a restrictive class of monads, and our “monad” 
does not even satisfy all the monad laws (compare Sect. 3). 

Above we have compared different proof techniques in the case where we 
allow infinite and partial input. Let us now reflect on whether one should consider 
anything but finite, total values. The proofs of (1’) and (2) valid for all inputs 
were considerably longer than the ones for (1) and (2) limited to finite (and in 
one case total) input, especially if one takes into account all work involved in 
rewriting the properties and programs. It is not hard to see why people often 
ignore partial and infinite input; handling it does seem to require nontrivial 
amounts of extra work. 

However, as argued in Sect. 1 we often need to reason about infinite values. 
Furthermore, in reality, bottoms do occur; error is used, cases are left out from 
case expressions, and sometimes functions do not reach a weak head normal form 
even if they are applied to total input (for instance we have reverse [1 ..] = ⊥).
Another reason for including partial values is that in our setting of equational 
reasoning it is easier to use a known identity if the identity is valid without 
a precondition stating that the input has to be total. Of course, proving the 
identity without this precondition is only meaningful if the extra work involved 
is less than the accumulated work needed to verify the precondition each time 
the identity is used. This extra work may not amount to very much, though. 
Even if we were to ignore bottoms, we would still sometimes need to handle 
infinite values, so we would have to use methods like those used in this text. In 
this case the marginal cost for also including bottoms would be small. 

Another approach is to settle for approximate results by e.g. assuming that
λx → ⊥ is ⊥ when reasoning about programs. These results would be practically
useful; we might get some overly conservative results if we happened to evaluate
seq (λx → ⊥), but nothing worse would happen. On the other hand, many
of the caveats mentioned in Sect. 3 would vanish. Furthermore most people 
tend to ignore these issues when doing ordinary programming, so in a sense an 
approximate semantics is already in use. The details of an approximate semantics 
for Haskell still need to be worked out, though. We believe that an approach 
like this will make it easier to scale up the methods used in this text to larger 
programs. 
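
The difference between ⊥ and λx → ⊥ that such an approximate semantics would ignore can be observed directly in GHC. The sketch below is only illustrative: isBottomIO is a hypothetical helper (it detects exceptions raised when forcing a value to weak head normal form, not non-termination), and lambdaBottom and checks are names introduced for this example.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: True if forcing x to weak head normal form raises
-- an exception. It cannot detect non-terminating computations.
isBottomIO :: a -> IO Bool
isBottomIO x = do
  r <- try (evaluate x)
  case r of
    Left e  -> const (return True) (e :: SomeException)
    Right _ -> return False

-- A function that returns ⊥ for every argument, but is not ⊥ itself.
lambdaBottom :: Int -> Int
lambdaBottom = \_ -> undefined

-- First component: is ⊥ (at function type) bottom?
-- Second component: is λx → ⊥ bottom?
checks :: IO (Bool, Bool)
checks = do
  a <- isBottomIO (undefined :: Int -> Int)
  b <- isBottomIO lambdaBottom
  return (a, b)
```

Under the approximate semantics discussed above both components would be treated as bottom; in the actual semantics only the first is, because a lambda abstraction is already in weak head normal form.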
A QuickCheck Generators

The QuickCheck generators used in this text are defined as follows:

    tree :: Gen T
    tree = frequency [(6, liftM2 B tree tree),
                      (2, return L),
                      (1, return ⊥)]

    string :: Gen String
    string = frequency [(1, bottomString),
                        (1, finiteString),
                        (1, infiniteString),
                        (3, treeString)]
      where
      bottomString   = liftM2 approx arbitrary infiniteString
      finiteString   = liftM2 (take ∘ abs) arbitrary infiniteString
      infiniteString = liftM2 (:) char infiniteString
      treeString     = tree >>= return ∘ pretty'

    char :: Gen Char
    char = frequency [(10, return 'B'),
                      (10, return 'L'),
                      (1, return
                      (1, return ⊥)]

    pair :: Gen (T, String)
    pair = frequency [(50, liftM2 (,) tree string),
                      (1, return ⊥)]



A straightforward Arbitrary instance for Nat (yielding only total, finite values) 
is also required. 

The generator tree is defined so that the generated trees have a probability
of 1/2 of being finite, and the finite trees have an expected depth of 2 (see
footnote 2). We do not generate any total, infinite trees. The reason is that
some of the tests above do not terminate for such trees, as shown in Sect. 5.
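
The probability that tree produces a finite tree can be estimated with a short calculation. This is a sketch under stated assumptions: the choices made by frequency are taken to be independent, the weights 6, 2 and 1 are read from the definition of tree above, and both L and ⊥ are counted as ending a branch. The probability q of finiteness is then the least solution of q = 3/9 + 6/9 · q², which the hypothetical value finiteProb approximates by iteration.

```haskell
-- Least fixed point of q = 3/9 + 6/9 * q^2, approximated by iterating
-- from 0. Each B node needs both of its subtrees to be finite.
finiteProb :: Double
finiteProb = iterate step 0 !! 200
  where
    step q = 3 / 9 + 6 / 9 * q * q
```

The two solutions of the equation are 1/2 and 1, and iteration from 0 converges to the smaller one.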

To get a good mix of finite and infinite partial strings the string generator 
is split up into four cases. The last case ensures that some strings that actually 
represent trees are also included. It would not be a problem to include total, 
infinite strings, but we do not want to complicate the definitions above too 
much, so they are also omitted. 

Finally the pair generator constructs pairs using tree and string, forcing some
pairs to be ⊥.

By using collect we have observed that the actual distributions of generated 
values correspond to our expectations. 



2 Assuming that QuickCheck uses a random number generator that yields independent 
values from a uniform distribution. 




Describing Gen/Kill Static Analysis Techniques 
with Kleene Algebra* 



Therrezinha Fernandes and Jules Desharnais



Département d'informatique et de génie logiciel
Université Laval, Québec, QC, G1K 7P4 Canada

{Therrezinha.Fernandes, Jules.Desharnais}@ift.ulaval.ca



Abstract. Static program analysis consists of compile-time techniques 
for determining properties of programs without actually running them. 
Using Kleene algebra, we formalize four instances of a static data flow 
analysis technique known as gen/kill analysis. This formalization clearly 
reveals the dualities between the four instances; although these dualities 
are known, the standard formalization does not reveal them in such a 
clear and concise manner. We provide two equivalent sets of equations 
characterizing the four analyses for two representations of programs, one 
in which the statements label the nodes of a control flow graph and one 
in which the statements label the transitions. 



1 Introduction 

Static program analysis consists of compile-time techniques for determining 
properties of programs without actually running them. Information gathered 
by these techniques is traditionally used by compilers for optimizing the ob- 
ject code [1] and by CASE tools for software engineering and reengineering [2, 
3]. Among the more recent applications is the detection of malicious code or
code that might be maliciously exploited [4, 5]. Due to ongoing research in this
area [5], the latter application is the main motivation for developing the
algebraic approach to static analysis described in this paper (but we will not discuss
applications to security here). Our goal is the development of an algebraic
framework based on Kleene algebra (KA) [6-11], in which the relevant properties can
be expressed in a compact and readable way.

In this paper, we examine four instances of a static data flow analysis tech- 
nique known as gen/kill analysis [1, 12, 13]. The standard description of the four 
instances is given in Sect. 2. The necessary concepts of Kleene algebra are then 
presented in Sect. 3. The four gen/kill analyses are formalized with KA in Sect. 4. 
This formalization clearly reveals the dualities between the four kinds of analysis; 
although these dualities are known, the standard formalization does not reveal 
them in such a clear and concise manner. We provide two equivalent sets of 
equations characterizing the four analyses for two representations of programs, 
one in which the statements label the nodes of a control flow graph and one in 

* This research is supported by NSERC (Natural Sciences and Engineering Research 
Council of Canada). 
which the statements label the transitions. In the conclusion, we make additional
comments on the approach and on directions for future research. 



2 Four Different Gen/Kill Analyses 



The programming language we will use is the standard while language, with 
atomic statements skip and x := E (assignment), and compound statements 
S1; S2 (sequence), if b then S1 else S2 (conditional) and while b do S (while loop).
In data flow analysis, it is common to use an abstract graph representation of a
program from which one can extract useful information. Traditionally [1, 12, 13],
this representation is a control flow graph (CFG), which is a directed graph where
each node corresponds to a statement and the edges describe how control might 
flow from one statement to another. Labeled Transition Systems (LTSs) can also 
be used. With LTSs, edges (arcs, arrows) are labeled by the statements of the 
program and nodes are points from which and toward which control leaves and 
returns. Figure 1 shows CFGs and LTSs for the compound statements, and the 
corresponding matrix representations; the CFG for an atomic statement consists 
of a single node while its LTS consists of two nodes linked by an arrow labeled 
with the statement. The numbers at the left of the nodes for the CFGs and inside 
the nodes for the LTSs are labels that also correspond to the lines/columns in 
the matrix representations. Note that the two arrows leaving node 1 in the LTSs 
of the conditional and while loop are both labelled b, i.e., the cases where b holds 
and does not hold are not distinguished. This distinction will not be needed here 
(and it is not present in the CFGs either). For both representations, the nodes 
of the graphs will usually be called program points, or points for short. 




Fig. 1. CFGs and LTSs for compound statements 
The four instances of gen/kill analysis that we will consider are Reaching
Definitions Analysis (RD), Live Variables Analysis (LV), Available Expressions
Analysis (AE) and Very Busy Expressions Analysis (VBE). An informal
description, extracted from [13], follows.

RD Reaching definitions analysis determines, for each program point, which as- 
signments may have been made and not overwritten when program execution 
reaches this point along some path. 

A main application of RD is in the construction of direct links between 
statements that produce values and statements that use them. 

LV A variable is live at the exit from a program point if there exists a path,
from that point to a use of the variable, that does not redefine the variable.
Live variables analysis determines, for each program point, which variables 
are live at the exit from this point. 

This analysis might be used as the basis for dead code elimination. If a 
variable is not live at the exit from a statement, then, if the statement is an 
assignment to the variable, the statement can be eliminated. 

AE Available expressions analysis determines, for each program point, which 
expressions have already been computed, and not later modified, on all paths 
to the program point. 

This information can be used to avoid the recomputation of an expression. 
VBE An expression is very busy at the exit from a program point if, no matter 
which path is taken from that point, the expression is always evaluated before 
any of the variables occurring in it are redefined. Very busy expressions 
analysis determines, for each program point, which expressions are very busy 
at the exit from the point. 

A possible optimization based on this information is to evaluate the expres- 
sion and store its value for later use. 

Each of the four analyses uses a universal dataset D whose type of elements 
depends on the analysis. This set D contains information about the program 
under consideration, and possibly also information about the environment of 
the program if it appears inside a larger program. Statements generate and kill 
elements from D. Statements are viewed either as producing a subset out ⊆ D
from a subset in ⊆ D, or as producing in ⊆ D from out ⊆ D, depending on the
direction of the analysis. Calculating in from out (or the converse) is the main
goal of the analysis. Each analysis is either forward or backward, and is said to
be either a may or must analysis. This is detailed in the following description.

RD The set D is a set of definitions. A definition is a pair (x,l), where l is the 
label of an assignment x := E. The assignment x := E at label l generates the 
definition (x, l) and kills all other definitions of x. From the above definition 
of RD, it can be seen that, for each program point, the analysis looks at 
paths between the entry point of the program and that program point; thus, 
the analysis is a forward one. Also, it is a may analysis, since the existence 
of a path with the desired property suffices. 
LV The set D is a set of variables. The analysis looks for the existence of a
path with a specific property between program points and the exit of the
program. It is thus a backward may analysis.

AE The set D is a set of expressions or subexpressions. The paths considered
are those between the entry point of the program and the program points
(forward analysis). Since all paths to a program point must have the desired
property, it is a must analysis.

VBE The set D is a set of expressions or subexpressions. The paths considered
are those between the program points and the exit point of the program
(backward analysis). Since all paths from a program point must have the
desired property, it is a must analysis.



Table 1. Gen/kill values for atomic statements and tests. The symbol l denotes a label
and the symbol b a test

         l : x := E                                                         skip          b
         gen                            kill                                gen   kill    gen      kill
    RD   {(x, l)}                       {(x, l') ∈ D | l' ≠ l}              ∅     ∅       ∅        ∅
    LV   Var(E)                         {x} − Var(E)                        ∅     ∅       Var(b)   ∅
    AE   {E' ∈ Exp(E) | x ∉ Var(E')}    {E' ∈ D | x ∈ Var(E')}              ∅     ∅       Exp(b)   ∅
    VBE  Exp(E)                         {E' ∈ D | x ∈ Var(E')} − Exp(E)     ∅     ∅       Exp(b)   ∅



Table 1 gives the definitions of gen and kill for the atomic statements and
tests for the four analyses. Note that for each statement S, gen(S) ⊆ D − kill(S) (the
complement of kill(S)) for the four analyses. This is a natural property, meaning
that if something is generated, then it is not killed. In this table, Var(E) denotes
the set of variables of expression E and Exp(E) denotes the set of its
subexpressions. These definitions can be extended recursively to the case of compound
statements (Table 2). The forward/backward duality is apparent when
comparing the values of gen(S1; S2) and kill(S1; S2) for RD and AE with those for LV
and VBE. The may/must duality between RD, LV and AE, VBE is most visible
for the conditional (uses of ∪ vs ∩ in the expressions for gen).

Finally, Table 3 shows how out(S) and in(S) can be recursively calculated.
Here too, the dualities forward/backward and may/must are easily seen.

We now illustrate RD analysis with the program given in Fig. 2. We will use
this program throughout the paper. The numbers at the left of the program are
labels. These labels are the same in the given CFG representation.

The set of definitions that appear in the program is {(x, 1), (x, 3), (y, 4)}.
Assume that this program is embedded in a larger program that contains the
definitions (x, 5) and (y, 6) (they may appear before label 1, even if they have
a larger number as label) and that these definitions reach the entry point of the
example program. Using Table 1 for RD, we get the following gen/kill values for
the atomic statements, where S_l denotes the atomic statement at label l:
Table 2. Gen/kill expressions for compound statements

    S1; S2
              gen                                 kill
    RD, AE    gen(S2) ∪ (gen(S1) − kill(S2))      kill(S2) ∪ (kill(S1) − gen(S2))
    LV, VBE   gen(S1) ∪ (gen(S2) − kill(S1))      kill(S1) ∪ (kill(S2) − gen(S1))

    if b then S1 else S2
              gen                                                        kill
    RD        gen(S1) ∪ gen(S2)                                          kill(S1) ∩ kill(S2)
    LV        gen(b) ∪ gen(S1) ∪ gen(S2)                                 (kill(S1) ∩ kill(S2)) − gen(b)
    AE        (gen(S1) ∩ gen(S2)) ∪ (gen(b) − (kill(S1) ∪ kill(S2)))     kill(S1) ∪ kill(S2)
    VBE       gen(b) ∪ (gen(S1) ∩ gen(S2))                               (kill(S1) ∪ kill(S2)) − gen(b)

    while b do S1
              gen                   kill
    RD        gen(S1)               ∅
    LV        gen(b) ∪ gen(S1)      ∅
    AE, VBE   gen(b)                kill(S1) − gen(b)




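Table 3. Linking in and out ("imm." abbreviates "immediately")

\[
\begin{array}{l|l|l}
 & \mathrm{in}(S) & \mathrm{out}(S) \\ \hline
\text{RD} & \bigcup (S' \mid S' \text{ imm. precedes } S : \mathrm{out}(S')) & \mathrm{gen}(S) \cup (\mathrm{in}(S) - \mathrm{kill}(S)) \\
\text{LV} & \mathrm{gen}(S) \cup (\mathrm{out}(S) - \mathrm{kill}(S)) & \bigcup (S' \mid S' \text{ imm. follows } S : \mathrm{in}(S')) \\
\text{AE} & \bigcap (S' \mid S' \text{ imm. precedes } S : \mathrm{out}(S')) & \mathrm{gen}(S) \cup (\mathrm{in}(S) - \mathrm{kill}(S)) \\
\text{VBE} & \mathrm{gen}(S) \cup (\mathrm{out}(S) - \mathrm{kill}(S)) & \bigcap (S' \mid S' \text{ imm. follows } S : \mathrm{in}(S'))
\end{array}
\]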



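\[
\begin{array}{ll}
\mathrm{gen}(S_1) = \{(x,1)\}, & \mathrm{kill}(S_1) = \{(x,3), (x,5)\}, \\
\mathrm{gen}(S_2) = \emptyset, & \mathrm{kill}(S_2) = \emptyset, \\
\mathrm{gen}(S_3) = \{(x,3)\}, & \mathrm{kill}(S_3) = \{(x,1), (x,5)\}, \\
\mathrm{gen}(S_4) = \{(y,4)\}, & \mathrm{kill}(S_4) = \{(y,6)\}.
\end{array}
\tag{1}
\]

Using Table 2 for RD, we get

\[
\begin{array}{rcl}
\mathrm{gen}(S_1; \text{if } S_2 \text{ then } S_3 \text{ else } S_4) & = & \{(x,3), (y,4)\} \cup (\{(x,1)\} - \emptyset) \\
 & = & \{(x,1), (x,3), (y,4)\}, \\
\mathrm{kill}(S_1; \text{if } S_2 \text{ then } S_3 \text{ else } S_4) & = & \emptyset \cup (\{(x,3), (x,5)\} - \{(x,3), (y,4)\}) \\
 & = & \{(x,5)\}.
\end{array}
\]

Finally, Table 3 for RD yields

\[
\begin{array}{rcl}
\mathrm{in}(S_1; \text{if } S_2 \text{ then } S_3 \text{ else } S_4) & = & \{(x,5), (y,6)\}, \\
\mathrm{out}(S_1; \text{if } S_2 \text{ then } S_3 \text{ else } S_4) & = & \{(x,1), (x,3), (y,4)\} \cup (\{(x,5), (y,6)\} - \{(x,5)\}) \\
 & = & \{(x,1), (x,3), (y,4), (y,6)\}.
\end{array}
\]

1  x := x × y;
2  if x > z
3     then x := x - z
4     else y := y + z

[CFG diagram: nodes 1 to 4, with edges 1 → 2, 2 → 3 and 2 → 4]

Fig. 2. CFG for running example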
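The classical calculation above can be replayed mechanically. The following is a minimal sketch, assuming definitions are encoded as (variable, label) pairs and statements as (gen, kill) pairs; the function names `seq_rd`, `cond_rd` and `out_rd` are illustrative and do not come from the paper. Since the test at label 2 generates and kills nothing for RD, it is omitted from the conditional rule.

```python
# Gen/kill pairs for the atomic statements, taken from (1).
S1 = ({("x", 1)}, {("x", 3), ("x", 5)})
S3 = ({("x", 3)}, {("x", 1), ("x", 5)})
S4 = ({("y", 4)}, {("y", 6)})


def seq_rd(first, second):
    """RD rule of Table 2 for S1; S2."""
    (g1, k1), (g2, k2) = first, second
    return g2 | (g1 - k2), k2 | (k1 - g2)


def cond_rd(then_branch, else_branch):
    """RD rule of Table 2 for a conditional (the test has empty gen and kill)."""
    (g1, k1), (g2, k2) = then_branch, else_branch
    return g1 | g2, k1 & k2


def out_rd(stmt, in_set):
    """RD rule of Table 3: out(S) = gen(S) | (in(S) - kill(S))."""
    gen, kill = stmt
    return gen | (in_set - kill)


program = seq_rd(S1, cond_rd(S3, S4))
result = out_rd(program, {("x", 5), ("y", 6)})
print(program)  # the (gen, kill) pair of the whole program
print(result)   # definitions reaching the exit of the program
```

Under these assumptions, `program` is the pair of sets computed above from Table 2 and `result` is the set computed from Table 3.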



3 Kleene Algebra with Tests 

In this section, we first introduce Kleene algebra [6, 9] and a specialization of it, 
namely Kleene algebra with tests [10]. Then, we recall the notion of matrices 
over a Kleene algebra and discuss how we will use them for our application. 

Definition 1. A Kleene algebra (KA) [9] is a structure $\mathcal{K} = (K, +, \cdot, {}^*, 0, 1)$
such that $(K, +, 0)$ is a commutative monoid, $(K, \cdot, 1)$ is a monoid, and the
following laws hold:

\[
\begin{array}{ll}
a + a = a, & a \cdot (b + c) = a \cdot b + a \cdot c, \\
a \cdot 0 = 0 \cdot a = 0, & (a + b) \cdot c = a \cdot c + b \cdot c, \\
1 + a \cdot a^* = a^*, & b + a \cdot c \le c \;\Rightarrow\; a^* \cdot b \le c, \\
1 + a^* \cdot a = a^*, & b + c \cdot a \le c \;\Rightarrow\; b \cdot a^* \le c,
\end{array}
\]

where $\le$ is the partial order induced by $+$, that is,

\[ a \le b \iff a + b = b . \]

A Kleene algebra with tests [10] is a two-sorted algebra $(K, T, +, \cdot, {}^*, 0, 1, \neg)$
such that $(K, +, \cdot, {}^*, 0, 1)$ is a Kleene algebra and $(T, +, \cdot, \neg, 0, 1)$ is a Boolean
algebra, where $T \subseteq K$ and $\neg$ is a unary operator defined only on $T$.

Operator precedence, from lowest to highest, is $+$, $\cdot$, $({}^*, \neg)$.

It is immediate from the definition that $t \le 1$ for any test $t \in T$. The meet of
two tests $t, u \in T$ is their product $t \cdot u$. Every KA can be made into a KA with
tests, by taking $\{0, 1\}$ as the set of tests.

Models of KA with tests include algebras of languages over an alphabet, 
algebras of path sets in a directed graph [14], algebras of relations over a set and 
abstract relation algebras with transitive closure [15,16]. 

A very simple model of KA with tests is obtained by taking $K$ to be the
powerset of some set $D$ and defining, for every $a, b \subseteq D$,

\[ 0 = \emptyset, \quad 1 = D, \quad a^* = D, \quad \neg a = \overline{a}, \quad a + b = a \cup b, \quad a \cdot b = a \cap b. \tag{2} \]
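To make model (2) concrete, here is a small sketch that checks the laws of Definition 1 by brute force on a three-element carrier. The carrier `D` and all function names are assumptions chosen for illustration; only the operations themselves come from (2).

```python
from itertools import combinations

D = frozenset({"p", "q", "r"})   # hypothetical small carrier set
ZERO, ONE = frozenset(), D       # 0 is the empty set, 1 is D


def plus(a, b):
    return a | b                 # a + b is union


def dot(a, b):
    return a & b                 # a . b is intersection


def star(a):
    return D                     # a* = D for every a


def neg(a):
    return D - a                 # complement in D


def leq(a, b):
    return plus(a, b) == b       # natural order induced by +


def subsets(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def check_laws():
    """Return True if every law of Definition 1 holds on all subsets of D."""
    K = subsets(D)
    for a in K:
        assert plus(a, a) == a
        assert dot(a, ZERO) == ZERO == dot(ZERO, a)
        assert plus(ONE, dot(a, star(a))) == star(a)
        assert plus(ONE, dot(star(a), a)) == star(a)
        assert leq(a, ONE)       # every element is a test here
        assert plus(a, neg(a)) == ONE and dot(a, neg(a)) == ZERO
        for b in K:
            for c in K:
                assert dot(a, plus(b, c)) == plus(dot(a, b), dot(a, c))
                assert dot(plus(a, b), c) == plus(dot(a, c), dot(b, c))
                if leq(plus(b, dot(a, c)), c):
                    assert leq(dot(star(a), b), c)
                if leq(plus(b, dot(c, a)), c):
                    assert leq(dot(b, star(a)), c)
    return True


print(check_laws())
```

In this model every element is below 1, so every element is a test, which is why the Boolean laws are checked on all of `K`.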

The set of matrices of size $n \times n$ over a KA with tests can itself be turned
into a KA with tests by defining the following operations. The notation $A[i,j]$
refers to the entry in row $i$ and column $j$ of $A$.

1. 0: matrix whose entries are all 0, i.e., $0[i,j] = 0$,

2. 1: identity matrix (square), i.e., $1[i,j] = 1$ if $i = j$ and $1[i,j] = 0$ if $i \neq j$,

3. $(A + B)[i,j] = A[i,j] + B[i,j]$,

4. $(A \cdot B)[i,j] = \sum_k A[i,k] \cdot B[k,j]$,

5. The Kleene star of a square matrix is defined recursively [9]. If $A = (a)$, for
some $a \in K$, then $A^* = (a^*)$. If

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

(with graphic representation as a two-state automaton whose transition from
state $i$ to state $j$ is labelled by $A[i,j]$), for some $a, b, c, d \in K$, then

\[
A^* \stackrel{\mathrm{def}}{=}
\begin{pmatrix}
f^* & f^* \cdot b \cdot d^* \\
d^* \cdot c \cdot f^* & d^* + d^* \cdot c \cdot f^* \cdot b \cdot d^*
\end{pmatrix}
\tag{3}
\]

where $f = a + b \cdot d^* \cdot c$; the automaton corresponding to $A$ helps understand
that $f^*$ corresponds to paths from state 1 to state 1. If $A$ is a larger matrix,
it is decomposed as a $2 \times 2$ matrix of submatrices:
$A = \begin{pmatrix} B & C \\ D & E \end{pmatrix}$, where $B$ and $E$ are square and
nonempty. Then $A^*$ is calculated recursively using (3).
For our simple application below, $A^* = \sum_{n \ge 0} A^n$, where $A^0 = 1$
and $A^{n+1} = A \cdot A^n$.
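The two descriptions of the matrix star can be compared on a small instance. The sketch below assumes the powerset model (2) for the entries and computes $A^*$ both as the sum of powers (iterated to a fixed point, which exists because the carrier is finite) and with formula (3); the helper names and the sample matrix are illustrative only.

```python
D = frozenset({1, 2, 3})         # hypothetical carrier for the entries
ZERO, ONE = frozenset(), D


def identity(n):
    return [[ONE if i == j else ZERO for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    """(A . B)[i][j] is the union over k of A[i][k] & B[k][j]."""
    result = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = ZERO
            for k in range(len(B)):
                acc = acc | (A[i][k] & B[k][j])
            row.append(acc)
        result.append(row)
    return result


def mat_star(A):
    """Sum of all powers of A, computed as the limit of R := 1 + A . R."""
    R = identity(len(A))
    while True:
        nxt = mat_add(identity(len(A)), mat_mul(A, R))
        if nxt == R:
            return R
        R = nxt


def star2x2(A):
    """Formula (3), with the scalar star of model (2): x* = D."""
    (a, b), (c, d) = A
    d_star = ONE
    f = a | (b & d_star & c)     # f = a + b . d* . c
    f_star = ONE                 # f* = D in model (2), whatever f is
    return [[f_star, f_star & b & d_star],
            [d_star & c & f_star, d_star | (d_star & c & f_star & b & d_star)]]


A = [[frozenset({1}), frozenset({1, 2})],
     [frozenset({2, 3}), ZERO]]
print(mat_star(A) == star2x2(A))
```

The fixed-point loop is only a convenience for finite carriers; it is not claimed to be how the star is computed in general.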



By setting up an appropriate type discipline, one can define heterogeneous 
Kleene algebras as is done for heterogeneous relation algebras [17-19]. One can 
get a heterogeneous KA by considering matrices with different sizes over a KA; 
matrices can be joined or composed only if they satisfy appropriate size con- 
straints. 

In Sect. 4, we will only use matrices whose entries are all tests. Such matrices
are relations [20]; indeed, a top relation can be defined as the matrix filled with 1.
However, this is not completely convenient for our purpose. Rather, we will use
a matrix $S$ to represent the structure of programs and consider only matrices
below the reflexive transitive closure $S^*$ of $S$. Complementation can then be
defined as complementation relative to $S^*$:

\[ \overline{A}[i,j] = \neg(A[i,j]) \cdot S^*[i,j] . \]

It is easily checked that applying all the above operations to matrices below $S^*$
results in a matrix below $S^*$. This means that the KA we will use is Boolean,
with a top element $\top$ satisfying $1 \le \top$ (reflexivity) and $\top \cdot \top \le \top$ (transitivity).

As a final remark in this section, we point out that a (square) matrix $T$ is a
test iff it is a diagonal matrix whose diagonal contains tests (this implies $T \le 1$).
For instance, if $t_1$, $t_2$ and $t_3$ are tests,

\[
\begin{pmatrix} t_1 & 0 & 0 \\ 0 & t_2 & 0 \\ 0 & 0 & t_3 \end{pmatrix}
\text{ is a test and }
\neg \begin{pmatrix} t_1 & 0 & 0 \\ 0 & t_2 & 0 \\ 0 & 0 & t_3 \end{pmatrix}
= \begin{pmatrix} \neg t_1 & 0 & 0 \\ 0 & \neg t_2 & 0 \\ 0 & 0 & \neg t_3 \end{pmatrix} .
\]




Describing Gen/Kill Static Analysis Techniques with Kleene Algebra 117 



4 Gen/Kill Analysis with KA 

In order to illustrate the approach, we present in Sect. 4.1 the equations de- 
scribing RD analysis and apply them to the same example as in Sect. 2. The 
equations for the other analyses are presented in Sect. 4.2. 



4.1 RD Analysis 

We first explain how the data and programs to analyze are modelled. Then
we show how to carry out RD analysis, first by using a CFG-related matrix
representation and then an LTS-related matrix representation.

Recall that the set of definitions for the whole program is

\[ D = \{(x,1), (x,3), (y,4), (x,5), (y,6)\} . \tag{4} \]

We consider the powerset of $D$ as a Kleene algebra, as explained in Sect. 3
(see (2)).

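The input to the analysis consists of three matrices $S$, $g$ and $k$ representing
respectively the structure of the program, what is generated and what is killed
at each label. Here is an abstract definition of these matrices, where $g_i \subseteq D$ and
$k_i \subseteq D$, for $i = 1, \ldots, 4$.

\[
S = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\qquad
g = \begin{pmatrix} g_1 & 0 & 0 & 0 \\ 0 & g_2 & 0 & 0 \\ 0 & 0 & g_3 & 0 \\ 0 & 0 & 0 & g_4 \end{pmatrix}
\qquad
k = \begin{pmatrix} k_1 & 0 & 0 & 0 \\ 0 & k_2 & 0 & 0 \\ 0 & 0 & k_3 & 0 \\ 0 & 0 & 0 & k_4 \end{pmatrix}
\]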



Recall that $1 = D$ (see (2)). Using the values already found in (1), we get the
following as concrete instantiations for the example program (with $g_i$ for $\mathrm{gen}(S_i)$
and $k_i$ for $\mathrm{kill}(S_i)$):

\[
\begin{array}{llll}
g_1 = \{(x,1)\}, & g_2 = \emptyset, & g_3 = \{(x,3)\}, & g_4 = \{(y,4)\}, \\
k_1 = \{(x,3), (x,5)\}, & k_2 = \emptyset, & k_3 = \{(x,1), (x,5)\}, & k_4 = \{(y,6)\}.
\end{array}
\tag{5}
\]

Note that $g$ and $k$ are tests. The entries in their diagonal represent what is
generated or killed by the atomic instruction at each node. Table 1 imposes the
condition $\mathrm{gen}(S) \subseteq \overline{\mathrm{kill}(S)}$ for any atomic statement $S$; this translates to $g \le \neg k$
for the matrices given above.

Table 4 contains the equations that describe how to carry out RD analysis. 
This table has a simple and natural reading: 

1. $G$: To generate something, move on a path from an entry point ($S^*$), generate
that something ($g$), then move to an exit point while not killing what was
generated ($(S \cdot \neg k)^*$).

2. $\overline{K}$: To not kill something, do not kill it on the first step and move along the
program while not killing it, or generate it.

Table 4. RD existential ("may") gen/kill parameters for CFGs. Complementation is
relative to $S^*$

\[
\begin{array}{l|l}
 & \text{RD} \\ \hline
G & S^* \cdot g \cdot (S \cdot \neg k)^* \\
\overline{K} & \neg k \cdot (S \cdot \neg k)^* + G \\
O & G + i \cdot \overline{K}
\end{array}
\]



3. $O$: To output something, generate it or, if it comes from the environment
(the test $i$), do not kill it. Note how the expression for $O$ is close to that for
out given in Table 3, namely $\mathrm{gen}(S) \cup (\mathrm{in}(S) - \mathrm{kill}(S))$.

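We use the table to calculate $G$ and $\overline{K}$ for RD [footnote 1]. It is a simple task to verify
the following result (one has to use $g \le \neg k$, i.e., $g_i \le \neg k_i$, for $i = 1, \ldots, 4$). We
give the result for the abstract matrices, because it is more instructive.

\[
G = \begin{pmatrix}
g_1 & g_2 + g_1 \cdot \neg k_2 & g_3 + g_2 \cdot \neg k_3 + g_1 \cdot \neg k_2 \cdot \neg k_3 & g_4 + g_2 \cdot \neg k_4 + g_1 \cdot \neg k_2 \cdot \neg k_4 \\
0 & g_2 & g_3 + g_2 \cdot \neg k_3 & g_4 + g_2 \cdot \neg k_4 \\
0 & 0 & g_3 & 0 \\
0 & 0 & 0 & g_4
\end{pmatrix}
\]

\[
\overline{K} = \begin{pmatrix}
\neg k_1 & g_2 + \neg k_1 \cdot \neg k_2 & g_3 + g_2 \cdot \neg k_3 + \neg k_1 \cdot \neg k_2 \cdot \neg k_3 & g_4 + g_2 \cdot \neg k_4 + \neg k_1 \cdot \neg k_2 \cdot \neg k_4 \\
0 & \neg k_2 & g_3 + \neg k_2 \cdot \neg k_3 & g_4 + \neg k_2 \cdot \neg k_4 \\
0 & 0 & \neg k_3 & 0 \\
0 & 0 & 0 & \neg k_4
\end{pmatrix}
\]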

Consider the entry $G[1,3]$, for instance. This entry shows that what is generated
when executing all paths from label 1 to label 3 - here, since there is a
single path, this means executing statements at labels 1, 2 and 3 - is what is
generated at label 3, plus what is generated at label 2 and not killed at label 3,
plus what is generated at label 1 and not killed at labels 2 and 3. Similarly,
$\overline{K}[1,2]$ shows that what is not killed on the path from label 1 to label 2 is either
what is generated at label 2 or what is not killed at either of labels 1 and 2.

To compare these results with those of the classical approach of Sect. 2, it
suffices to collect from $G$ the data generated and from $\overline{K}$ the data not killed
between the entry point of the program (label 1) and its exit points (labels 3
and 4). This can always be done by means of a row vector $s$ selecting the entry
points and a column vector $t$ selecting the exit points. For our program,

\[
s = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad
t = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix},
\]

so that

\[ \mathrm{gen} = s \cdot G \cdot t = g_3 + g_4 + (g_2 + g_1 \cdot \neg k_2) \cdot (\neg k_3 + \neg k_4) \tag{6} \]

[Footnote 1] We use bold letters for matrices. Because the concepts of Table 4 may apply to other
kinds of KAs, the variables in the table are typeset in the usual mathematical font.

and

\[ \overline{\mathrm{kill}} = s \cdot \overline{K} \cdot t = g_3 + g_4 + (g_2 + \neg k_1 \cdot \neg k_2) \cdot (\neg k_3 + \neg k_4) . \]

Negating yields

\[ \mathrm{kill} = k_3 \cdot k_4 + k_2 \cdot \neg g_3 \cdot \neg g_4 + k_1 \cdot \neg g_2 \cdot \neg g_3 \cdot \neg g_4 . \]

This is easily read and understood by looking at the CFG. Using the concrete
values in (4) and (5) provides $\mathrm{gen} = \{(x,1), (x,3), (y,4)\}$ and $\mathrm{kill} = \{(x,5)\}$, just
like in Sect. 2.
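The equations of Table 4 can be evaluated directly on the concrete values (4) and (5). The sketch below is one possible encoding, assuming matrices are lists of lists of frozensets over model (2) and that the star is obtained by fixed-point iteration; all helper names are hypothetical. Negation of the tests $g$ and $k$ is the entrywise negation of the diagonal.

```python
X1, X3, Y4, X5, Y6 = ("x", 1), ("x", 3), ("y", 4), ("x", 5), ("y", 6)
D = frozenset({X1, X3, Y4, X5, Y6})
ZERO, ONE = frozenset(), D


def identity(n):
    return [[ONE if i == j else ZERO for j in range(n)] for i in range(n)]


def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else ZERO for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    result = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = ZERO
            for m in range(len(B)):
                acc = acc | (A[i][m] & B[m][j])
            row.append(acc)
        result.append(row)
    return result


def mat_star(A):
    R = identity(len(A))
    while True:
        nxt = mat_add(identity(len(A)), mat_mul(A, R))
        if nxt == R:
            return R
        R = nxt


S = [[ZERO, ONE, ZERO, ZERO],
     [ZERO, ZERO, ONE, ONE],
     [ZERO, ZERO, ZERO, ZERO],
     [ZERO, ZERO, ZERO, ZERO]]
g = diag([frozenset({X1}), ZERO, frozenset({X3}), frozenset({Y4})])
k = diag([frozenset({X3, X5}), ZERO, frozenset({X1, X5}), frozenset({Y6})])
not_k = diag([D - k[i][i] for i in range(4)])      # negation of the test k

step = mat_mul(S, not_k)                            # S . not k
G = mat_mul(mat_mul(mat_star(S), g), mat_star(step))
K_bar = mat_add(mat_mul(not_k, mat_star(step)), G)

s = [[ONE, ZERO, ZERO, ZERO]]                       # entry point: label 1
t = [[ZERO], [ZERO], [ONE], [ONE]]                  # exit points: labels 3, 4
gen = mat_mul(mat_mul(s, G), t)[0][0]
kill = D - mat_mul(mat_mul(s, K_bar), t)[0][0]
print(sorted(gen), sorted(kill))
```

The complement is applied after merging with $s$ and $t$, in line with the computation of kill by negating $s \cdot \overline{K} \cdot t$.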

One can get $K$ from the above matrix $\overline{K}$ by complementing, but complementation
must be done relative to $S^*$; the reason is that anything that gets killed
is killed along a program path, and unconstrained complementation incorrectly
introduces nonzero values outside program paths. The result is

\[
K = \begin{pmatrix}
k_1 & k_2 + k_1 \cdot \neg g_2 & k_3 + k_2 \cdot \neg g_3 + k_1 \cdot \neg g_2 \cdot \neg g_3 & k_4 + k_2 \cdot \neg g_4 + k_1 \cdot \neg g_2 \cdot \neg g_4 \\
0 & k_2 & k_3 + k_2 \cdot \neg g_3 & k_4 + k_2 \cdot \neg g_4 \\
0 & 0 & k_3 & 0 \\
0 & 0 & 0 & k_4
\end{pmatrix}
\]

One might think that $\mathrm{kill} = s \cdot K \cdot t$, but this is not the case. It is easy to see
that $s \cdot K \cdot t = K[1,3] + K[1,4]$, whereas the value of kill that we obtained above
is $\neg(\overline{K}[1,3] + \overline{K}[1,4]) = K[1,3] \cdot K[1,4]$, and the latter is the right value. The
reason for this behavior is that for RD, not killing, like generating, is existential,
in the sense that results from converging paths are joined ("may" analysis). In
the case of killing, these results should be intersected. But the effect of $s \cdot K \cdot t$ is
to join all entries of the form $K[\text{entry point}, \text{exit point}]$ ($K[1,3] + K[1,4]$ for the
example). Note that if the program has only one entry and one exit point, one
may use either $s \cdot K \cdot t$ or $\neg(s \cdot \overline{K} \cdot t)$; the equivalence follows from the fact that $s$
is then a total function, while $t$ is injective and surjective.

There remains one value to find for our example, that of $O$. In Table 4, the
equation for $O$ is $O = G + i \cdot \overline{K}$. The symbol $i$ denotes a test that characterizes
the data that come from the environment of the program (a larger program
containing it). For our example, $i$ has the form

\[
i = \begin{pmatrix} i_1 & 0 & 0 & 0 \\ 0 & i_2 & 0 & 0 \\ 0 & 0 & i_3 & 0 \\ 0 & 0 & 0 & i_4 \end{pmatrix} .
\]

Using the same $s$ and $t$ as above, we calculate $s \cdot O \cdot t$, which is the information
that gets out at the exit points as a function of what gets in at the entry point.
We get

\[
\begin{array}{rcl}
s \cdot O \cdot t = s \cdot (G + i \cdot \overline{K}) \cdot t & = & g_3 + g_4 \\
 & + & g_2 \cdot (\neg k_3 + \neg k_4) \\
 & + & g_1 \cdot \neg k_2 \cdot (\neg k_3 + \neg k_4) \\
 & + & i_1 \cdot \neg k_1 \cdot \neg k_2 \cdot (\neg k_3 + \neg k_4) .
\end{array}
\]

Thus, what gets out at exit labels 3 and 4 is what is generated at labels 3 and 4,
plus what is generated at labels 1 and 2 and not killed after, plus what comes
from the environment at label 1 and is not killed after. With the instantiations
given in (4) and (5),

\[ s \cdot O \cdot t = \{(x,1), (x,3), (y,4)\} + i_1 \cdot \{(x,1), (y,4), (y,6)\} . \]

If - as in Sect. 2 - we assume that the data coming from the environment at
label 1 is $i_1 = \{(x,5), (y,6)\}$, then $s \cdot O \cdot t = \{(x,1), (x,3), (y,4), (y,6)\}$ - as in
Sect. 2.
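The closed form obtained for $s \cdot O \cdot t$ can be evaluated with ordinary set operations. This is a small sketch under the instantiations (4) and (5); the function name `s_O_t` and the dictionary encoding are illustrative assumptions.

```python
X1, X3, Y4, X5, Y6 = ("x", 1), ("x", 3), ("y", 4), ("x", 5), ("y", 6)
D = frozenset({X1, X3, Y4, X5, Y6})

g = {1: frozenset({X1}), 2: frozenset(), 3: frozenset({X3}), 4: frozenset({Y4})}
k = {1: frozenset({X3, X5}), 2: frozenset(), 3: frozenset({X1, X5}), 4: frozenset({Y6})}


def neg(a):
    return D - a


def s_O_t(i1):
    """Evaluate the expanded expression for s . O . t given the input test i1."""
    tail = neg(k[3]) | neg(k[4])             # not k3 + not k4
    return (g[3] | g[4]
            | (g[2] & tail)
            | (g[1] & neg(k[2]) & tail)
            | (i1 & neg(k[1]) & neg(k[2]) & tail))


print(sorted(s_O_t(frozenset({X5, Y6}))))
```

With an empty environment, `s_O_t(frozenset())` reduces to the value of gen from (6).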

We now turn to the LTS representation of programs, which is often more 
natural. For instance, it is used for the representation of automata, or for giving 
relational descriptions of programs [19]. Our example program and its LTS graph 
are given in Fig. 3. As mentioned in Sect. 2, we do not distinguish the two possible 
run-time results of the test x > z, since this does not change any of the four 
analyses. 




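\[
S = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\qquad
g = \begin{pmatrix}
0 & g_1 & 0 & 0 & 0 \\
0 & 0 & g_2 & g_2 & 0 \\
0 & 0 & 0 & 0 & g_3 \\
0 & 0 & 0 & 0 & g_4 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\qquad
k = \begin{pmatrix}
0 & k_1 & 0 & 0 & 0 \\
0 & 0 & k_2 & k_2 & 0 \\
0 & 0 & 0 & 0 & k_3 \\
0 & 0 & 0 & 0 & k_4 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

Fig. 3. LTS for running example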



The matrices $S$, $g$ and $k$ again respectively represent the structure of the
program, what is generated and what is killed by atomic statements. Note that
$g$ and $k$ are not tests as for the CFG representation. Rather, entries $g_i$ and $k_i$
label arrows and in a way can be viewed as an abstraction of the effect of the
corresponding statements. The concrete instantiations (4) and (5) still apply.

Table 5 can then be used in the same manner as Table 4 to derive $G$, $\tilde{K}$
and $O$. In this table, as usual, $a^+$ denotes $a \cdot a^*$. The variable $i$ still denotes
a test. The operator $\tilde{\ }$ denotes complementation with respect to $S^+$, so that
$\tilde{K}$ is the complement of $K$ intersected with $S^+$. For CFGs, complementation is
done with respect to $S^*$, because the instructions are on the nodes. For LTSs,
the instructions are on the arcs, so that no killing or generation can occur at a
node, unless it occurs via a nonnull circular path; this explains why
complementation is done with respect to $S^+$.

Table 5. RD existential ("may") gen/kill parameters for LTSs. The operator $\tilde{\ }$ is
complementation relative to $S^+$

\[
\begin{array}{l|l}
 & \text{RD} \\ \hline
G & S^* \cdot g \cdot (\tilde{k} \cap S)^* \\
\tilde{K} & (\tilde{k} \cap S)^+ + G \\
O & G + i \cdot \tilde{K}
\end{array}
\]



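The calculation of $G$ gives

\[
G = \begin{pmatrix}
0 & g_1 & g_2 + g_1 \cdot \neg k_2 & g_2 + g_1 \cdot \neg k_2 & g_3 + g_4 + (g_2 + g_1 \cdot \neg k_2) \cdot (\neg k_3 + \neg k_4) \\
0 & 0 & g_2 & g_2 & g_3 + g_4 + g_2 \cdot (\neg k_3 + \neg k_4) \\
0 & 0 & 0 & 0 & g_3 \\
0 & 0 & 0 & 0 & g_4 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix} .
\]

Using

\[
s = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad
t = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix},
\]

we obtain for $\mathrm{gen} = s \cdot G \cdot t$ the same result as for the CFG (see (6)).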
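The LTS equations of Table 5 can be checked on the same concrete values. The sketch below assumes the matrix encoding of Fig. 3 over model (2), with the complement of $k$ taken entrywise relative to $S^+$; the helper names (`arrows`, `mat_meet`, and so on) are hypothetical.

```python
X1, X3, Y4, X5, Y6 = ("x", 1), ("x", 3), ("y", 4), ("x", 5), ("y", 6)
D = frozenset({X1, X3, Y4, X5, Y6})
ZERO, ONE = frozenset(), D


def identity(n):
    return [[ONE if i == j else ZERO for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_meet(A, B):
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    result = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = ZERO
            for m in range(len(B)):
                acc = acc | (A[i][m] & B[m][j])
            row.append(acc)
        result.append(row)
    return result


def mat_star(A):
    R = identity(len(A))
    while True:
        nxt = mat_add(identity(len(A)), mat_mul(A, R))
        if nxt == R:
            return R
        R = nxt


def arrows(e1, e2, e3, e4):
    """5x5 matrix of Fig. 3 with the given labels on the LTS arrows."""
    Z = ZERO
    return [[Z, e1, Z, Z, Z],
            [Z, Z, e2, e2, Z],
            [Z, Z, Z, Z, e3],
            [Z, Z, Z, Z, e4],
            [Z, Z, Z, Z, Z]]


S = arrows(ONE, ONE, ONE, ONE)
g = arrows(frozenset({X1}), ZERO, frozenset({X3}), frozenset({Y4}))
k = arrows(frozenset({X3, X5}), ZERO, frozenset({X1, X5}), frozenset({Y6}))

S_star = mat_star(S)
S_plus = mat_mul(S, S_star)
# Complement of k relative to S+, then restricted to single arrows of S.
k_tilde = [[(D - k[i][j]) & S_plus[i][j] for j in range(5)] for i in range(5)]
step = mat_meet(k_tilde, S)

G = mat_mul(mat_mul(S_star, g), mat_star(step))
K_tilde = mat_add(mat_mul(step, mat_star(step)), G)

s = [[ONE, ZERO, ZERO, ZERO, ZERO]]
t = [[ZERO], [ZERO], [ZERO], [ZERO], [ONE]]
gen = mat_mul(mat_mul(s, G), t)[0][0]
not_killed = mat_mul(mat_mul(s, K_tilde), t)[0][0]
print(sorted(gen), sorted(D - not_killed))
```

Both values agree with those of the CFG representation, which is the property established more generally in Sect. 4.2.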



4.2 Gen/Kill Analysis for the Four Analyses 

In this section, we present the equations for all four analyses. We begin with the
CFG representation (Table 6). The following remarks are in order.

Table 6. Existential ("may") gen/kill parameters for CFGs. Complementation is
relative to $S^*$

\[
\begin{array}{l|l|l|l|l}
 & \text{RD} & \text{LV} & \text{AE} & \text{VBE} \\ \hline
G & S^* \cdot g \cdot (S \cdot \neg k)^* & (\neg k \cdot S)^* \cdot g \cdot S^* & & \\
\overline{G} & & & \neg g \cdot (S \cdot \neg g)^* + K & (\neg g \cdot S)^* \cdot \neg g + K \\
K & & & S^* \cdot k \cdot (S \cdot \neg g)^* & (\neg g \cdot S)^* \cdot k \cdot S^* \\
\overline{K} & \neg k \cdot (S \cdot \neg k)^* + G & (\neg k \cdot S)^* \cdot \neg k + G & & \\
O & G + i \cdot \overline{K} & & & \\
I & & G + \overline{K} \cdot o & & \\
\overline{O} & & & K + \neg i \cdot \overline{G} & \\
\overline{I} & & & & K + \overline{G} \cdot \neg o
\end{array}
\]

1. Reading the expressions is mostly done as for RD. For LV and VBE, it is
better to read the expressions backward, because they are backward analyses.
The reading is then the same as for RD, except that $o$ is used instead of $i$
to denote what comes from the environment and $I$ is used instead of $O$ for
the result. This was done simply to have a more exact correspondence with
Table 3. Although $o$ is input data, the letter $o$ is used because the data is
provided at the exit points; similarly, the letter $I$ is used for the output data
because it is associated with the entry points.

2. For all atomic and compound statements and all equations of Table 6, one can
do the same kind of abstract comparison with the results given by Tables 1
and 2 as we have done for the example in Sect. 4.1. The results are the same.

3. The forward/backward and may/must dualities are apparent in the tables
of Sect. 2, but they are much more visible and clear here.

(a) The forward/backward correspondences RD $\leftrightarrow$ LV and AE $\leftrightarrow$ VBE
are obtained by reading the expressions in the reverse direction and
by switching in and out: $i \leftrightarrow o$ and $O \leftrightarrow I$. One can also use the
relational converse operator $\smile$; then, for LV,
$G = (\neg k \cdot S)^* \cdot g \cdot S^* = (S^{\smile *} \cdot g^{\smile} \cdot (S^{\smile} \cdot \neg k^{\smile})^*)^{\smile}$.
The same can be done for $\overline{K}$ and $I$. Thus, to
make an LV analysis, one can switch $i$, $o$, reverse the program, do the
calculations of an RD analysis, reverse the result and switch $I$, $O$ (of
course, $g$ and $k$ are those for LV, not for RD). The same can be said
about AE and VBE.

(b) The may/must duality between RD and AE is first revealed by the fact
that $G$, $\overline{K}$ and $O$ are existential for RD, whereas $\overline{G}$, $K$ and $\overline{O}$ are
existential for AE (similar comment for LV and VBE). But the correspondences
RD $\leftrightarrow$ AE and LV $\leftrightarrow$ VBE are much deeper and can in fact be obtained
simply by switching gen and kill, and complementing in and out: $g \leftrightarrow k$,
$G \leftrightarrow K$, $i \leftrightarrow \neg i$, $o \leftrightarrow \neg o$, $I \leftrightarrow \overline{I}$, $O \leftrightarrow \overline{O}$.

These dualities mean that only one kind of analysis is really necessary, since
the other three can be obtained by simple substitutions and simple additional
operations (converse and complementation).

4. All nonempty entries in Table 6 correspond to existential cases; collecting
data with entry and exit vectors as we have done in Sect. 4.1 should be done
with the parameters as given in Table 6 and not on their negation (unless
there is only one entry and one exit point).

5. Table 6 can be applied to programs with goto statements to fixed labels
without any additional machinery.

The equations for LTSs are given in Table 7. Similar comments can be made
about this table as for Table 6.

The formulae in Tables 6 and 7 are obviously related, but it is interesting
to see how the connection can be described formally and this is what we now
do. We will also show the following: If $P$ denotes any of the parameters in the
left column of either Table 6 or Table 7 and if $s$ and $t$ denote the entry and
exit vectors appropriate for the representation (CFG or LTS), then the value of
$s \cdot P \cdot t$ is the same for both approaches, provided only the existential equations
given in the tables are used (no complementation before merging the data with
$s$ and $t$). Note that the main goal of the analyses is to calculate $s \cdot O \cdot t$ or $s \cdot I \cdot t$.

Table 7. Existential ("may") gen/kill parameters for LTSs. The operator $\tilde{\ }$ is
complementation relative to $S^+$

\[
\begin{array}{l|l|l|l|l}
 & \text{RD} & \text{LV} & \text{AE} & \text{VBE} \\ \hline
G & S^* \cdot g \cdot (\tilde{k} \cap S)^* & (\tilde{k} \cap S)^* \cdot g \cdot S^* & & \\
\tilde{G} & & & (\tilde{g} \cap S)^+ + K & (\tilde{g} \cap S)^+ + K \\
K & & & S^* \cdot k \cdot (\tilde{g} \cap S)^* & (\tilde{g} \cap S)^* \cdot k \cdot S^* \\
\tilde{K} & (\tilde{k} \cap S)^+ + G & (\tilde{k} \cap S)^+ + G & & \\
O & G + i \cdot \tilde{K} & & & \\
I & & G + \tilde{K} \cdot o & & \\
\tilde{O} & & & K + \neg i \cdot \tilde{G} & \\
\tilde{I} & & & & K + \tilde{G} \cdot \neg o
\end{array}
\]



The basic idea is best explained in terms of graphs. To go from a CFG to an 
LTS, it suffices to add a new node and to add arrows from the exit nodes of the 
CFG to the new node - which becomes the new and only exit node - and then 
to “push” the information associated to nodes of the CFG to the appropriate 
arrows of the LTS. 

Let us see how this is done with matrices. For these explanations, we append
a subscript $L$ to the matrices related to an LTS. The following matrices $S$ and
$S_L$ represent the structure of the CFG of Fig. 2 and that of the LTS of Fig. 3,
respectively.

\[
S = \begin{pmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\qquad
S_L = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

The matrix $S_L$ is structured as a matrix of four submatrices, one of which is
the CFG matrix $S$ and another is the column vector $t$ that was used in (6) to
select the exit nodes of the CFG. The role of this vector in $S_L$ is to add links
from the exit nodes of the CFG to the new node corresponding to column 5.

Now consider the matrices of Sect. 4.1. The CFG matrix $g$ can be converted
to the LTS matrix $g$, here called $g_L$, as follows.

\[
g_L = \begin{pmatrix} g \cdot S & g \cdot t \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} g & x \\ 0 & y \end{pmatrix} \cdot \begin{pmatrix} S & t \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} g & x \\ 0 & y \end{pmatrix} \cdot S_L
\]

The value of the submatrices $x$ and $y$ does not matter, since these disappear in
the result of the composition. One can use the concrete values given in Sect. 4.1
and check that indeed in that case
$g_L = \begin{pmatrix} g \cdot S & g \cdot t \\ 0 & 0 \end{pmatrix}$. The same holds for $G$ and
$K$. The matrix $\begin{pmatrix} g & x \\ 0 & y \end{pmatrix}$ is an embedding of the CFG $g$ in a larger graph with
an additional node. Composition with $S_L$ "pushes" the information provided by
$g$ on the appropriate arrows of $g_L$.
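The conversion just described can be checked on the running example. The sketch below assumes the powerset model (2) and the concrete values (5); the helper `extend`, which builds a block matrix with one extra row and column, is a hypothetical name. Two different fillers for $x$ and $y$ are used to illustrate that they do not affect the result.

```python
X1, X3, Y4 = ("x", 1), ("x", 3), ("y", 4)
D = frozenset({X1, X3, Y4, ("x", 5), ("y", 6)})
ZERO, ONE = frozenset(), D
Z = ZERO


def mat_mul(A, B):
    result = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = ZERO
            for m in range(len(B)):
                acc = acc | (A[i][m] & B[m][j])
            row.append(acc)
        result.append(row)
    return result


def extend(A, last_col, corner):
    """Block matrix (A last_col; 0 corner) with one extra row and column."""
    rows = [row + [c] for row, c in zip(A, last_col)]
    rows.append([ZERO] * len(A) + [corner])
    return rows


S = [[Z, ONE, Z, Z], [Z, Z, ONE, ONE], [Z, Z, Z, Z], [Z, Z, Z, Z]]
t = [ZERO, ZERO, ONE, ONE]                     # exit nodes of the CFG
g1, g2, g3, g4 = frozenset({X1}), ZERO, frozenset({X3}), frozenset({Y4})
g = [[g1, Z, Z, Z], [Z, g2, Z, Z], [Z, Z, g3, Z], [Z, Z, Z, g4]]

S_L = extend(S, t, ZERO)                       # (S t; 0 0)
g_L = mat_mul(extend(g, [ONE] * 4, ONE), S_L)  # x and y filled with 1
g_L_alt = mat_mul(extend(g, [ZERO] * 4, ZERO), S_L)  # x and y filled with 0

fig3_g = [[Z, g1, Z, Z, Z],
          [Z, Z, g2, g2, Z],
          [Z, Z, Z, Z, g3],
          [Z, Z, Z, Z, g4],
          [Z, Z, Z, Z, Z]]
print(g_L == fig3_g, g_L == g_L_alt)
```

The last row of $S_L$ is zero, which is why the extra column and corner of the embedding vanish in the product.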

We now abstract from the matrix context. We assume a heterogeneous
Boolean KA such that, given correctly typed elements $a, b, c, d$, it is possible
to form matrices like $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. To distinguish expressions related to CFGs and
LTSs, we add a subscript $L$ to variables in the latter expressions.

We will show that the RD CFG expressions given in Table 6 can be transformed
into the corresponding LTS expressions given in Table 7. An analogous
treatment can be done for the other analyses.

We begin with $G$, whose expression is $S^* \cdot g \cdot (S \cdot \neg k)^*$, and show that it
transforms to $S_L^* \cdot g_L \cdot (\tilde{k}_L \cap S_L)^*$. We first note the following two properties:

= ⟨ Assumptions ⟩
  $\tilde{k}_L \cap S_L$

And now comes the proof that $G_L = S_L^* \cdot g_L \cdot (\tilde{k}_L \cap S_L)^*$.

  $G_L$
= ⟨ Assumption ⟩
  $\begin{pmatrix} G \cdot S & G \cdot t \\ 0 & 0 \end{pmatrix}$
= ⟨ Matrix composition ⟩
  $\begin{pmatrix} G & ? \\ 0 & ? \end{pmatrix} \cdot \begin{pmatrix} S & t \\ 0 & 0 \end{pmatrix}$
= ⟨ Expression for $G$ and definition of $f$ ⟩
= ⟨ Definition of $f$ and induction using (7) ⟩
= ⟨ KA sliding rule: $(a \cdot b)^* \cdot a = a \cdot (b \cdot a)^*$ ⟩
= ⟨ Assumptions and (8) ⟩
  $S_L^* \cdot g_L \cdot (\tilde{k}_L \cap S_L)^*$

The transformation of the CFG subexpression $\neg k \cdot (S \cdot \neg k)^*$ (appearing in the
definition of $\overline{K}$ in Table 6) is done in a similar fashion, except that the last steps
are

  $f(\neg k) \cdot (f(S) \cdot f(\neg k))^* \cdot f(S)$
= ⟨ KA sliding rule: $(a \cdot b)^* \cdot a = a \cdot (b \cdot a)^*$ ⟩
  $f(\neg k) \cdot f(S) \cdot (f(\neg k) \cdot f(S))^*$
= ⟨ $a \cdot a^* = a^+$ and (8) ⟩
  $(\tilde{k}_L \cap S_L)^+$




What remains to establish is the correspondence between $i$ for CFGs and $i_L$
for LTSs. Since we want $i_L$ to be a test just like $i$, we cannot take $i_L = f(i) \cdot f(S)$
like for the other matrices. It turns out that
$i_L = \begin{pmatrix} i & 0 \\ 0 & 0 \end{pmatrix}$ is convenient and is
indeed what we would choose using intuition, because it does not make sense to
feed information to the additional exit node, since it is past all the instructions.
One then has

\[
O_L = \begin{pmatrix} (G + i \cdot \overline{K}) \cdot S & (G + i \cdot \overline{K}) \cdot t \\ 0 & 0 \end{pmatrix}
= G_L + \begin{pmatrix} i & 0 \\ 0 & 0 \end{pmatrix} \cdot (\overline{K})_L
= G_L + i_L \cdot \tilde{K}_L .
\]

Finally, we show that the calculation of the information along paths between
entry and exit nodes gives the same result for CFGs and LTSs. For CFGs, this
information is obtained by calculating $s \cdot P \cdot t$, where $s$ is the vector of entry
nodes, $t$ is the vector of exit nodes and $P$ is any of the existential parameters
($G$, $\overline{K}$, $O$, ...), depending on the analysis. For LTSs, the corresponding expression
is $s_L \cdot P_L \cdot t_L$. As above, we assume
$P_L = \begin{pmatrix} P \cdot S & P \cdot t \\ 0 & 0 \end{pmatrix}$, and
$s_L = \begin{pmatrix} s & 0 \end{pmatrix}$ and
$t_L = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, meaning essentially that the additional node cannot be an entry
point and must be the only exit point. We then have

\[
s_L \cdot P_L \cdot t_L
= \begin{pmatrix} s & 0 \end{pmatrix} \cdot \begin{pmatrix} P \cdot S & P \cdot t \\ 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix}
= s \cdot P \cdot t ,
\]

so that using either a CFG or an LTS gives the same result.

Note that no property of s or t has been used in the explanation of the 
transformation from CFGs to LTSs. 



5 Conclusion 

We have shown how four instances of gen/kill analysis can be described using 
Kleene algebra. This has been done for a CFG-like and an LTS-like representa- 
tion of programs (using matrices). The result of this exercise is a very concise 
and very readable set of equations characterizing the four analyses. This has re- 
vealed the symmetries between the analyses much more clearly than the classical 
approach. 

We have in fact used relations for the formalization, so that the framework of 
relation algebra with transitive closure [15, 16] could have been used instead of 
that of Kleene algebra. Note however that converse has been used only to explain 
the forward/backward dualities, but is used nowhere in the calculations. We 
prefer Kleene algebra or Boolean Kleene algebra because the results have wider 
applicability. It is reasonable to expect to find examples where the equations of 
Tables 6 and 7 could be used for something other than relations. Also, we hope to
connect the Kleene formulation of the gen/kill analyses with representations of 
programs where Kleene algebra is already used. For instance, Kleene algebra is 
already employed to analyze sequences of abstract program actions for security 
properties [11]. Instead of keeping only the name of an action (instruction), it 
would be possible to construct a triple (name, gen, kill) giving information about 
the name assigned to the instruction and what it generates and kills. Such triples 
can be elements of a KA by applying KA operations componentwise. Is it then 
possible to prove stronger security properties in the framework of KA, given that 
more information is available? 

We plan to investigate other types of program analysis to see if the techniques 
presented in this paper could apply to them. We would also like to describe the 
analyses of this paper using a KA-based deductive approach in the style of 
that used in [21]. Another intriguing question is whether the set-based program 
analysis framework of [22] is related to our approach.

Abstract. In this paper we define Kleene algebra with tests in a slightly
more general way than Kozen’s definition. Then we give an explicit con- 
struction of the free Kleene algebra with tests generated by a pair of sets. 
We also show that the category KAT of Kleene algebras with tests and 
the category KAT- of Kozen’s Kleene algebras with tests are related by 
an adjunction. This fact shows that an infinitely-generated free Kleene 
algebra with tests in the sense of Kozen can be obtained as the image of 
our free algebra under the left adjoint from KAT to KAT-; moreover, 
the image is isomorphic to itself. Therefore, our free Kleene algebra with 
tests is isomorphic to Kozen and Smith’s free Kleene algebra with tests 
if their construction is available. Finally, we show that Kozen and Smith’s
free Kleene algebra with tests can be presented as a coproduct of Kleene 
algebras. This is induced from our free construction. 



1 Introduction 

Kozen [6] defined a Kleene algebra with tests to be a Kleene algebra with an 
embedded Boolean algebra. The starting point of this paper is the observation 
that the important point of this notion is not the subset property, but the fact 
that their underlying idempotent semiring structure is shared. Due to this
observation, we define a Kleene algebra with tests as a triple consisting of a Boolean
algebra, a Kleene algebra, and a function from the carrier set of the Boolean
algebra to the carrier of the Kleene algebra which is a homomorphism between
their underlying idempotent semirings. So the category KAT of our Kleene
algebras with tests is the comma category $(U_{BI}, U_{KI})$ of the functor $U_{BI}$ from the
category Bool of Boolean algebras to the category ISR of idempotent semirings
and the functor $U_{KI}$ from the category Kleene of Kleene algebras to ISR.

Kozen and Smith [7] showed that the Kleene algebra with tests consisting of 
the set of regular sets of guarded strings is the free Kleene algebra with tests 
generated by a pair of finite sets. [3,4] showed the existence of the free algebra 
generated by a pair of sets using the technique of finite limit sketches instead of
showing an explicit construction. Though these two results were established on
slightly different definitions, we did not analyse the difference between the two.

* Part of this work was done while the author was visiting McMaster University
under the Japan Society for the Promotion of Science (JSPS) Bilateral Programs
for Science Exchanges.

In this paper, we give an explicit construction of the free Kleene algebra with 
tests generated by a pair of sets using adjunctions between Kleene and the cate- 
gories of related algebraic structures, and coproducts in Kleene. Though Kozen 
and Smith’s construction requires finiteness of generators, our construction never 
requires this. We also show that the category KAT of our Kleene algebras with 
tests and the category KAT- of Kozen’s Kleene algebras with tests are related 
by an adjunction. The image of our free algebra under the left adjoint from KAT 
to KAT- is isomorphic to itself. So, our free algebra and Kozen and Smith’s 
are isomorphic whenever Kozen and Smith’s construction is available, that is, 
a generator of the Boolean algebra is finite. Finally, we show that Kozen and 
Smith’s free algebra can be presented as a coproduct in Kleene since our free 
algebra is also given as a coproduct. Our explicit free construction makes clear 
such an algebraic characterisation of Kozen and Smith’s free algebras. 

2 Kleene Algebras 

In this section we recall some basics of Kleene algebras [5] and related algebraic 
structures. [2] contains several examples of Kleene algebras. For basic notions of 
category theory we refer to [1,8]. 

Definition 1 (Kleene algebra). A Kleene algebra is a set $K$ equipped with
nullary operators $0$, $1$, binary operators $+$, $\cdot$, and a unary operator ${}^*$, where
the tuple $(K, +, \cdot, 0, 1)$ is an idempotent semiring and these data satisfy the
following:

\[
\begin{array}{l}
1 + (p \cdot p^*) = p^* \\
1 + (p^* \cdot p) = p^* \\
p \cdot r \le r \;\Rightarrow\; p^* \cdot r \le r \\
r \cdot p \le r \;\Rightarrow\; r \cdot p^* \le r
\end{array}
\]

where $\le$ refers to the natural partial order

\[ p \le q \;\stackrel{\mathrm{def}}{\iff}\; p + q = q . \]

A Kleene algebra will be called trivial if $0 = 1$, and non-trivial otherwise. The
category of Kleene algebras and homomorphisms between them will be denoted
by Kleene.

Remark 1. Kleene has binary coproducts. 

For two Kleene algebras $K_1$ and $K_2$, $K_1 + K_2$ denotes a coproduct of $K_1$ and
$K_2$. For two Kleene algebra homomorphisms $f : K_1 \to K_1'$ and $g : K_2 \to K_2'$,
$f + g$ denotes the unique Kleene algebra homomorphism such that the following
two diagrams commute.

\[
\begin{array}{ccccc}
K_1 & \xrightarrow{\;i\;} & K_1 + K_2 & \xleftarrow{\;j\;} & K_2 \\
\downarrow f & & \downarrow f + g & & \downarrow g \\
K_1' & \xrightarrow{\;i'\;} & K_1' + K_2' & \xleftarrow{\;j'\;} & K_2'
\end{array}
\]

where $i$ and $j$ are the coproduct injections of $K_1 + K_2$, and $i'$ and $j'$ of $K_1' + K_2'$.

The coproduct injections in Kleene are not always one-to-one. Trivial Kleene
algebras have only one element since, for each $a$,

$$a = a \cdot 1 = a \cdot 0 = 0 .$$

For each Kleene algebra $K$, there exists a unique Kleene algebra homomorphism
from $K$ to each trivial one. From a trivial Kleene algebra, there exists a Kleene
algebra homomorphism only if the target is also trivial. So, the coproduct of a trivial
Kleene algebra and a non-trivial one is trivial again. Then, we have an injection
which is not one-to-one. This example is due to Wolfram Kahl.

A Kleene algebra $K$ is called integral if it has no zero divisors, that is,

$$a \ne 0 \land b \ne 0 \implies a \cdot b \ne 0$$

holds for all $a, b \in K$. This notion is introduced in [2].

Proposition 1. Let $\mathbf{J} = (J, +_J, \cdot_J, {}^{*_J}, 0_J, 1_J)$ and $\mathbf{K} = (K, +_K, \cdot_K, {}^{*_K}, 0_K, 1_K)$
be non-trivial Kleene algebras. If $\mathbf{K}$ is integral, then the following holds.

(i) The mapping $f : K \to J$ defined to be $f(a) = 0_J$ if $a = 0_K$, and otherwise
$f(a) = 1_J$, is a Kleene algebra homomorphism.

(ii) The first injection $j : \mathbf{J} \to \mathbf{J} + \mathbf{K}$ is one-to-one.

Proof. Since $\mathbf{K}$ is non-trivial, we possibly have a Kleene algebra homomorphism
from $\mathbf{K}$ to a non-trivial one. For each $a, b \in K$, if $a \ne 0_K$ and $b \ne 0_K$, then $a +_K b \ne 0_K$
and $a \cdot_K b \ne 0_K$ since $\mathbf{K}$ is integral. So, (i) follows from

$$f(a) +_J f(b) = 1_J +_J 1_J = 1_J = f(a +_K b)$$
$$f(a) \cdot_J f(b) = 1_J \cdot_J 1_J = 1_J = f(a \cdot_K b)$$
$$f(a)^{*_J} = 1_J^{*_J} = 1_J = f(a^{*_K}) .$$

(ii) will be proved using the $f$ given in (i). Take $\mathrm{id}_J$ and $f$, then a unique
intermediating arrow $h : \mathbf{J} + \mathbf{K} \to \mathbf{J}$ with respect to them exists. By the definition
of coproducts, $h$ satisfies $\mathrm{id}_J = h \circ j$. Thus $j$ is one-to-one.
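Part (i) of Proposition 1 can be tried out on a small finite example. The sketch below assumes a hypothetical three-element chain `Zero < One < Top` as the integral algebra K and the two-element algebra on `Bool` as J; neither the example nor the names `K3`, `plus3`, `times3`, `star3` appear in the paper. The code checks the star axioms and integrality of the chain by brute force, then checks that the map f of (i) preserves all operations.

```haskell
-- A three-element chain Zero < One < Top (hypothetical example algebra).
data K3 = Zero | One | Top deriving (Eq, Ord, Show, Enum, Bounded)

all3 :: [K3]
all3 = [minBound .. maxBound]

plus3, times3 :: K3 -> K3 -> K3
plus3 = max                      -- join in the chain
times3 Zero _ = Zero             -- Zero is absorbing
times3 _ Zero = Zero
times3 One b = b                 -- One is the unit
times3 a One = a
times3 Top Top = Top

star3 :: K3 -> K3
star3 Top = Top
star3 _ = One

-- Star axioms of Definition 1; the natural order coincides with the chain order.
starAxioms3 :: Bool
starAxioms3 = and
  [ plus3 One (times3 p (star3 p)) == star3 p
    && plus3 One (times3 (star3 p) p) == star3 p
    && (not (times3 p r <= r) || times3 (star3 p) r <= r)
    && (not (times3 r p <= r) || times3 r (star3 p) <= r)
  | p <- all3, r <- all3 ]

-- No zero divisors: a product of non-zero elements is non-zero.
integral3 :: Bool
integral3 = and [ times3 a b /= Zero | a <- [One, Top], b <- [One, Top] ]

-- The map f of Proposition 1 (i): 0 goes to 0, everything else to 1.
f :: K3 -> Bool
f a = a /= Zero

-- f preserves 0, 1, +, . and * (on Bool these are ||, && and const True).
fIsHomomorphism :: Bool
fIsHomomorphism = and
  [ not (f Zero), f One
  , and [ f (plus3 a b) == (f a || f b) | a <- all3, b <- all3 ]
  , and [ f (times3 a b) == (f a && f b) | a <- all3, b <- all3 ]
  , and [ f (star3 a) | a <- all3 ]
  ]
```

Integrality is what makes the multiplication case work: without it, two non-zero elements could multiply to zero while their images multiply to one.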

Let Set, ISR, and Bool denote the categories of sets and functions, idempotent
semirings and their homomorphisms, and Boolean algebras and their homomorphisms,
respectively. $U_K : \mathbf{Kleene} \to \mathbf{Set}$ denotes the forgetful functor
which takes a Kleene algebra to its carrier set. The functor $U_K$ can be decomposed
into functors $U_{KI} : \mathbf{Kleene} \to \mathbf{ISR}$ and $U_I : \mathbf{ISR} \to \mathbf{Set}$, where $U_{KI}(K)$
is an idempotent semiring obtained by forgetting the $^*$ operator and $U_I$ takes
an idempotent semiring to its carrier set. These two functors $U_{KI}$ and $U_I$ have
left adjoints $F_{IK}$ and $F_I$ respectively. $F_K \stackrel{\mathrm{def}}{=} F_{IK} \circ F_I$ is a left adjoint to $U_K$.

Remark 2 ($\mathrm{Reg}(\Sigma)$ [5]). For a set $\Sigma$, $\mathrm{Reg}(\Sigma)$ denotes the Kleene algebra
consisting of the set of regular sets over $\Sigma$ together with the standard operations
on regular sets. Clearly, $\mathrm{Reg}(\Sigma)$ is integral. Moreover, it is known that
$\mathrm{Reg}(\Sigma) \cong F_K(\Sigma)$.

Since we consider Kleene algebras with tests in this paper, Boolean algebras
are also important. We denote the forgetful functor which takes a Boolean algebra
to its carrier set by $U_B : \mathbf{Bool} \to \mathbf{Set}$. $U_B$ satisfies similar properties to $U_K$
together with $U_I$. We denote the forgetful functor from Bool to ISR and its left
adjoint by $U_{BI}$ and $F_{IB}$ respectively. $F_B \stackrel{\mathrm{def}}{=} F_{IB} \circ F_I$ is a left adjoint to $U_B$.
The situation we state above is as follows:

$$F_{IB} \dashv U_{BI} : \mathbf{Bool} \to \mathbf{ISR}, \qquad F_{IK} \dashv U_{KI} : \mathbf{Kleene} \to \mathbf{ISR}, \qquad F_I \dashv U_I : \mathbf{ISR} \to \mathbf{Set},$$

$$F_B = F_{IB} \circ F_I, \qquad F_K = F_{IK} \circ F_I,$$
$$U_B = U_I \circ U_{BI}, \qquad U_K = U_I \circ U_{KI} .$$

Let us now explicitly state some of the facts constituting the adjunction
$F_{IK} \dashv U_{KI}$. For each idempotent semiring $S$ and Kleene algebra $K$, the bijection
from $\mathbf{Kleene}(F_{IK}(S), K)$ to $\mathbf{ISR}(S, U_{KI}(K))$ is denoted by $\varphi_{S,K}$. Indeed, for
each arrow $f : S' \to S$ in ISR and $g : K \to K'$ in Kleene, the following diagram
in Set commutes:

$$\varphi_{S',K'} \circ \mathbf{Kleene}(F_{IK}(f), g) = \mathbf{ISR}(f, U_{KI}(g)) \circ \varphi_{S,K},$$

where $\mathbf{Kleene}(F_{IK}(f), g)$ maps

$$\mathbf{Kleene}(F_{IK}(S), K) \ni h \mapsto g \circ h \circ F_{IK}(f) \in \mathbf{Kleene}(F_{IK}(S'), K')$$

and $\mathbf{ISR}(f, U_{KI}(g))$ maps

$$\mathbf{ISR}(S, U_{KI}(K)) \ni k \mapsto U_{KI}(g) \circ k \circ f \in \mathbf{ISR}(S', U_{KI}(K')) .$$

This property is called naturality of $\varphi$. The subscript $S, K$ will be omitted unless
confusion occurs.

The next property follows from the naturality.

Lemma 1. Diagram (1) commutes if and only if diagram (2) commutes.

$$g \circ i = j \circ F_{IK}(f) \qquad (1)$$

$$U_{KI}(g) \circ \varphi(i) = \varphi(j) \circ f \qquad (2)$$

Here $f : S \to S'$ is an arrow in ISR, $g : K \to K'$ is an arrow in Kleene,
$i : F_{IK}(S) \to K$, and $j : F_{IK}(S') \to K'$.

Proof. Assume that (1) commutes. The arrow $j$ is taken to $\varphi(j \circ F_{IK}(f))$ and $\varphi(j) \circ f$
by $\varphi \circ \mathbf{Kleene}(F_{IK}(f), \mathrm{id}_{K'})$ and $\mathbf{ISR}(f, U_{KI}(\mathrm{id}_{K'})) \circ \varphi$, respectively. By
the naturality of $\varphi$, we have $\varphi(j \circ F_{IK}(f)) = \varphi(j) \circ f$. Similarly, considering
$\mathbf{Kleene}(F_{IK}(\mathrm{id}_S), g)$ and $\mathbf{ISR}(\mathrm{id}_S, U_{KI}(g))$, we obtain $\varphi(g \circ i) = U_{KI}(g) \circ \varphi(i)$.
By the assumption, we have $\varphi(j) \circ f = U_{KI}(g) \circ \varphi(i)$. The opposite direction is
proved similarly using bijectivity in addition to naturality.



For each idempotent semiring $S$, the idempotent-semiring homomorphism
$\varphi(\mathrm{id}_{F_{IK}(S)}) : S \to U_{KI}(F_{IK}(S))$ is called a component of the unit of the
adjunction $F_{IK} \dashv U_{KI}$ with respect to $S$. By Lemma 1, $\varphi(\mathrm{id}_{F_{IK}(S)})$ satisfies
$f = U_{KI}(\varphi^{-1}(f)) \circ \varphi(\mathrm{id}_{F_{IK}(S)})$ for each Kleene algebra $K$ and each
idempotent-semiring homomorphism $f : S \to U_{KI}(K)$ since the following diagram commutes:

$$\varphi^{-1}(f) \circ \mathrm{id}_{F_{IK}(S)} = \varphi^{-1}(f) \circ F_{IK}(\mathrm{id}_S) .$$

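Proposition 2. Let $\mathbf{S} = (S, +, \cdot, 0, 1)$ be an idempotent semiring such that, for
each $s \in S$, $s \le 1$. Then, $F_{IK}(\mathbf{S}) \cong (S, +, \cdot, {}^*, 0, 1)$, where $^*$ maps each $s \in S$
to $1$.

Proof. Set $\mathbf{S}' = (S, +, \cdot, {}^*, 0, 1)$. For each Kleene algebra $K$, an idempotent-semiring
homomorphism $h : \mathbf{S} \to U_{KI}(K)$ is also a Kleene algebra homomorphism
from $\mathbf{S}'$ to $K$ by the definition of $\mathbf{S}'$ and the assumption on $\mathbf{S}$. Take
$\varphi(\mathrm{id}_{F_{IK}(S)}) : \mathbf{S} \to U_{KI}(F_{IK}(\mathbf{S}))$, then the following diagram commutes, where
$U_{KI}(\mathbf{S}') = \mathbf{S}$ and $U_{KI}(\varphi(\mathrm{id}_{F_{IK}(S)})) = \varphi(\mathrm{id}_{F_{IK}(S)})$:

$$U_{KI}(\varphi(\mathrm{id}_{F_{IK}(S)})) \circ \mathrm{id}_S = \varphi(\mathrm{id}_{F_{IK}(S)}) \circ \mathrm{id}_S .$$

By Lemma 1, the following diagram commutes:

$$\varphi(\mathrm{id}_{F_{IK}(S)}) \circ \varphi^{-1}(\mathrm{id}_S) = \mathrm{id}_{F_{IK}(S)} \circ F_{IK}(\mathrm{id}_S) .$$

So, we have $\mathrm{id}_{F_{IK}(S)} = \varphi(\mathrm{id}_{F_{IK}(S)}) \circ \varphi^{-1}(\mathrm{id}_S)$. Since $\varphi(\mathrm{id}_{F_{IK}(S)})$ is a component
of the unit of $F_{IK} \dashv U_{KI}$ with respect to $\mathbf{S}$, and $\mathbf{S} = U_{KI}(\mathbf{S}')$, we have
$\mathrm{id}_S = U_{KI}(\varphi^{-1}(\mathrm{id}_S)) \circ \varphi(\mathrm{id}_{F_{IK}(S)})$. Here, $U_{KI}(\varphi^{-1}(\mathrm{id}_S))$ is $\varphi^{-1}(\mathrm{id}_S)$ itself. Thus we
have $\mathrm{id}_S = \varphi^{-1}(\mathrm{id}_S) \circ \varphi(\mathrm{id}_{F_{IK}(S)})$.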

Proposition 3. Let $\mathbf{B} = (B, +, \cdot, \bar{\ }, 0, 1)$ and $\mathbf{S} = (S, +, \cdot, 0, 1)$ be a Boolean
algebra and an idempotent semiring, respectively. The image $\mathrm{im}(f)$ of $B$ under the
idempotent-semiring homomorphism $f : U_{BI}(\mathbf{B}) \to \mathbf{S}$ forms a Boolean algebra.

Proof. The set $\mathrm{im}(f)$ is closed under $+$ and $\cdot$. Also, $\mathrm{im}(f)$ is a distributive lattice
together with $+$, $\cdot$, $0$, and $1$. For each $x \in \mathrm{im}(f)$, define $\neg x = f(\bar{a})$ for some
$a \in B$ such that $x = f(a)$. $\neg x$ is well defined since, if $f(a) = f(b)$,

$$f(\bar{a}) = f(\bar{a}) \cdot 1 = f(\bar{a}) \cdot (f(b) + f(\bar{b})) = f(\bar{a}) \cdot (f(\bar{b}) + f(a)) = f(\bar{a}) \cdot f(\bar{b})$$
$$= (f(b) \cdot f(\bar{b})) + (f(\bar{a}) \cdot f(\bar{b})) = (f(a) + f(\bar{a})) \cdot f(\bar{b}) = 1 \cdot f(\bar{b}) = f(\bar{b}) .$$

Then, $(\mathrm{im}(f), +, \cdot, \neg, 0, 1)$ is a Boolean algebra.

For an idempotent-semiring homomorphism $f : U_{BI}(\mathbf{B}) \to \mathbf{S}$, the Boolean algebra
$(\mathrm{im}(f), +, \cdot, \neg, 0, 1)$ will be denoted by $f[\mathbf{B}]$.



3 Kleene Algebras with Tests 

We provide a definition of Kleene algebras with tests. The definition is slightly 
more general than Kozen’s. 

Definition 2 (Kleene algebra with tests). A Kleene algebra with tests is a
triple $(B, K, i)$ where

- $B$ is a Boolean algebra,

- $K$ is a Kleene algebra, and

- $i$ is an idempotent-semiring homomorphism from $U_{BI}(B)$ to $U_{KI}(K)$.

The category of Kleene algebras with tests and their homomorphisms, which is
denoted by KAT, is the comma category $(U_{BI}, U_{KI})$, that is, an arrow $(f, g)$
from $(B, K, i)$ to $(B', K', i')$ in KAT is a pair of a Boolean algebra homomorphism
$f : B \to B'$ and a Kleene algebra homomorphism $g : K \to K'$ such that the
following diagram commutes:

$$U_{KI}(g) \circ i = i' \circ U_{BI}(f) .$$

For details of comma categories, see [8].

A Kleene algebra with tests in the sense of Kozen [6] is a special case where
$i$ is an inclusion. KAT- denotes the category of Kleene algebras with tests in
the sense of Kozen.

Example 1. The triple

$$\bigl((\{0, 1, a, \bar{a}\}, \lor, \land, \bar{\ }, 0, 1),\ (\{0, 1\}, +, \cdot, {}^*, 0, 1),\ \{0 \mapsto 0,\ 1 \mapsto 1,\ a \mapsto 1,\ \bar{a} \mapsto 0\}\bigr)$$

is an object in KAT but not an object in KAT-.
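The claim of Example 1 can be checked mechanically. The following Haskell sketch represents the four-element Boolean algebra by pairs of bits (an encoding chosen here for convenience, not taken from the paper) and verifies that the third component is an idempotent-semiring homomorphism into the two-element algebra that is not one-to-one, so it cannot be an inclusion. All names (`B4`, `bits`, `join4`, `meet4`, `i`) are hypothetical.

```haskell
-- The four-element Boolean algebra {0, 1, a, ā} of Example 1.
data B4 = O | I | A | NotA deriving (Eq, Show, Enum, Bounded)

all4 :: [B4]
all4 = [minBound .. maxBound]

-- Encode each element as a pair of bits so that join and meet are componentwise.
bits :: B4 -> (Bool, Bool)
bits O = (False, False)
bits I = (True, True)
bits A = (True, False)
bits NotA = (False, True)

unbits :: (Bool, Bool) -> B4
unbits (False, False) = O
unbits (True, True) = I
unbits (True, False) = A
unbits (False, True) = NotA

join4, meet4 :: B4 -> B4 -> B4
join4 x y = unbits (fst (bits x) || fst (bits y), snd (bits x) || snd (bits y))
meet4 x y = unbits (fst (bits x) && fst (bits y), snd (bits x) && snd (bits y))

-- The map i of Example 1: 0 to 0, 1 to 1, a to 1, ā to 0.
i :: B4 -> Bool
i x = fst (bits x)

-- i preserves 0, 1, join (+) and meet (.).
iIsSemiringHom :: Bool
iIsSemiringHom = and
  [ not (i O), i I
  , and [ i (join4 x y) == (i x || i y) | x <- all4, y <- all4 ]
  , and [ i (meet4 x y) == (i x && i y) | x <- all4, y <- all4 ]
  ]

-- i is one-to-one only if equal images force equal arguments.
iIsInjective :: Bool
iIsInjective = and [ x == y | x <- all4, y <- all4, i x == i y ]
```

Here `iIsSemiringHom` holds while `iIsInjective` fails, because `a` and `1` have the same image.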

The following property is immediate from Proposition 3. 

Proposition 4. For each object $(B, K, i)$ in KAT, $(i[B], K, \subseteq)$ is an object in
KAT-, where $i[B]$ is the Boolean algebra consisting of $\mathrm{im}(i)$. Moreover, if $i$ is
one-to-one, $(B, K, i) \cong (i[B], K, \subseteq)$.

Definition 3 (free Kleene algebra with tests). A free Kleene algebra with
tests generated by a pair $(T, \Sigma)$ of sets $T$ and $\Sigma$ is defined to be a Kleene algebra
with tests $(B, K, i)$ and a pair $(\eta_T, \eta_\Sigma)$ of maps $\eta_T : T \to B$ from $T$ to the carrier
set $B$ of $B$ and $\eta_\Sigma : \Sigma \to K$ from $\Sigma$ to the carrier set $K$ of $K$ which satisfy the
following universal property:

for each Kleene algebra with tests $(B', K', i')$ and each pair $(f, g)$ of maps
$f : T \to B'$ from $T$ to the carrier set $B'$ of $B'$ and $g : \Sigma \to K'$ from $\Sigma$
to the carrier set $K'$ of $K'$, there is a unique arrow $(\hat{f}, \hat{g}) : (B, K, i) \to (B', K', i')$
in KAT such that

$$f = \hat{f} \circ \eta_T \quad \text{and} \quad g = \hat{g} \circ \eta_\Sigma .$$

Kozen and Smith [7] gave a construction of the free Kleene algebra with tests 
generated by a pair of finite sets of atomic tests and atomic actions through a 
construction of the set of languages of guarded strings. Though Kozen and Smith 
require the finiteness of the two sets, it is not necessary that the set of atomic 
actions is finite. Since their result is based on Kozen’s definition, $i$ and $i'$ in
Definition 3 are required to be inclusions. After giving our construction of free
Kleene algebra with tests (in Section 4), we compare Kozen-Smith’s and ours
(in Section 5).

4 Free Construction of Kleene Algebra with Tests



In [3,4], the existence of the free Kleene algebra with tests generated by a pair 
of (possibly infinite) sets was shown using the technique of finite limit sketches. 
With this technique, an explicit construction is not necessary for the proof, so 
none was given. This section provides an explicit construction using adjunctions 
and coproducts in Kleene. This construction does not require the finiteness of 
generators. 

If the forgetful functor $U : \mathbf{KAT} \to \mathbf{Set} \times \mathbf{Set}$ which takes an object $(B, K, i)$
to a pair $(B, K)$ of their carrier sets and an arrow $(h, k)$ to a pair $(h, k)$ of
functions has a left adjoint, the image of the pair $(T, \Sigma)$ of sets under the left
adjoint together with the unit of the adjunction is the free Kleene algebra with
tests generated by $(T, \Sigma)$.

Since we already have the adjunctions $F_B \dashv U_B$ and $F_K \dashv U_K$, the functor
$F_B \times F_K : \mathbf{Set} \times \mathbf{Set} \to \mathbf{Bool} \times \mathbf{Kleene}$ is a left adjoint to the functor
$U_B \times U_K : \mathbf{Bool} \times \mathbf{Kleene} \to \mathbf{Set} \times \mathbf{Set}$.

Define $\Psi$ to be the functor from KAT to Bool × Kleene which takes an
object $(B, K, i)$ to the pair $(B, K)$ of algebras and an arrow $(h, k)$ to the pair
$(h, k)$ of homomorphisms. Then it holds that $U = (U_B \times U_K) \circ \Psi$. So, if $\Psi$ has
a left adjoint, we obtain a left adjoint to $U$. Thus we may have the free Kleene
algebra with tests generated by a pair of sets.

For a pair of a Boolean algebra $B$ and a Kleene algebra $K$, we have a Kleene
algebra with tests

$$(B,\ F_{IK}(U_{BI}(B)) + K,\ \varphi(i))$$

where $i$ is the first injection of the coproduct

$$F_{IK}(U_{BI}(B)) \xrightarrow{\ i\ } F_{IK}(U_{BI}(B)) + K \xleftarrow{\ j\ } K .$$

And for a pair of a Boolean algebra homomorphism $f : B \to B'$ and a Kleene
algebra homomorphism $g : K \to K'$ we have two idempotent-semiring homomorphisms
$U_{BI}(f)$ and $U_{KI}(F_{IK}(U_{BI}(f)) + g)$. Then these two satisfy the following.

Proposition 5. $\varphi(i') \circ U_{BI}(f) = U_{KI}(F_{IK}(U_{BI}(f)) + g) \circ \varphi(i)$ where $i$ and $i'$ are
the first injections of the coproducts $F_{IK}(U_{BI}(B)) + K$ and $F_{IK}(U_{BI}(B')) + K'$,
respectively.

Proof. By the definition of $F_{IK}(U_{BI}(f)) + g$ the diagram

$$(F_{IK}(U_{BI}(f)) + g) \circ i = i' \circ F_{IK}(U_{BI}(f))$$

commutes. So the diagram

$$U_{KI}(F_{IK}(U_{BI}(f)) + g) \circ \varphi(i) = \varphi(i') \circ U_{BI}(f)$$

commutes by Lemma 1.

Therefore, $(f, F_{IK}(U_{BI}(f)) + g)$ is an arrow from $(B, F_{IK}(U_{BI}(B)) + K, \varphi(i))$
to $(B', F_{IK}(U_{BI}(B')) + K', \varphi(i'))$ in KAT.

Define $\Phi$ to be the functor from Bool × Kleene to KAT which takes an
object $(B, K)$ to the object $(B, F_{IK}(U_{BI}(B)) + K, \varphi(i))$ and an arrow $(f, g)$ to
the arrow $(f, F_{IK}(U_{BI}(f)) + g)$. Then the following holds.

Theorem 1. $\Phi$ is a left adjoint to $\Psi$.



Proof. Define the mapping $\xi_{(B,K),(B',K',i')}$ from $\mathbf{KAT}(\Phi(B, K), (B', K', i'))$ to
$(\mathbf{Bool} \times \mathbf{Kleene})((B, K), \Psi((B', K', i')))$ for each object $(B, K)$ in Bool × Kleene
and $(B', K', i')$ in KAT as follows:

$$(f, g) \mapsto (f,\ g \circ j)$$

where $f : B \to B'$, $g : F_{IK}(U_{BI}(B)) + K \to K'$, and $j$ is the second injection of
$F_{IK}(U_{BI}(B)) + K$. In the sequel, $\xi$ means $\xi_{(B,K),(B',K',i')}$. It is sufficient to show
that $\xi$ is bijective. Assume that $\xi((f, g)) = \xi((f', g'))$, that is, $f = f'$ and
$g \circ j = g' \circ j$. Since $(f, g)$ and $(f', g')$ are arrows
in KAT, the following diagram commutes both for $y = g$ and for $y = g'$:

$$U_{KI}(y) \circ \varphi(i) = i' \circ U_{BI}(f) .$$

So, by Lemma 1, the diagram

$$y \circ i = \varphi^{-1}(i') \circ F_{IK}(U_{BI}(f)) \qquad (y = g \text{ or } g')$$

commutes again. Since $g \circ j = g' \circ j$ is assumed, it holds that $g = g'$ by
the uniqueness of the intermediating arrow of $F_{IK}(U_{BI}(B)) + K$ with respect
to $\varphi^{-1}(i') \circ F_{IK}(U_{BI}(f))$ and $g' \circ j$. Therefore, $\xi$ is one-to-one.

Given an arrow $(h, k) : (B, K) \to \Psi((B', K', i'))$ in Bool × Kleene, we obtain two arrows
$U_{BI}(h)$ and $U_{KI}(m)$ in ISR, where $m$ is the unique intermediating arrow of
$F_{IK}(U_{BI}(B)) + K$ with respect to $\varphi^{-1}(i') \circ F_{IK}(U_{BI}(h))$ and $k$. By the definition
of $m$, the diagram

$$m \circ i = \varphi^{-1}(i') \circ F_{IK}(U_{BI}(h))$$

commutes. So, by Lemma 1, the following diagram commutes, too:

$$U_{KI}(m) \circ \varphi(i) = i' \circ U_{BI}(h) .$$

So, $(h, m)$ is an arrow from $\Phi(B, K)$ to $(B', K', i')$ in KAT. Moreover, since
$\xi((h, m)) = (h, m \circ j)$ and $m \circ j = k$, $\xi((h, m)) = (h, k)$. Therefore, $\xi$ is onto.

Corollary 1. A component of the unit of $\Phi \dashv \Psi$ with respect to an object $(B, K)$
in Bool × Kleene is $(\mathrm{id}_B, j)$, where $j$ is the second injection of the coproduct
$F_{IK}(U_{BI}(B)) + K$.

Proof. It is immediate from $\xi(\mathrm{id}_{\Phi(B,K)}) = (\mathrm{id}_B, j)$.

Corollary 2. $\Phi \circ (F_B \times F_K)$ is a left adjoint to $U$.

Corollary 3. $(F_B(T), F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma), \varphi(i))$ together with the pair
$(\eta'_T, U_K(j) \circ \eta''_\Sigma)$ of maps $\eta'_T$ and $U_K(j) \circ \eta''_\Sigma$ is the free Kleene algebra with tests
generated by the pair $(T, \Sigma)$ of (possibly infinite) sets, where $i$ and $j$ are the
coproduct injections of $F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma)$, and $\eta'$ and $\eta''$ are the units of
$F_B \dashv U_B$ and $F_K \dashv U_K$.



5 Comparison 



In this section Kozen-Smith’s free Kleene algebra with tests and ours are com- 
pared. 

We begin by comparing KAT and KAT-. The category KAT- is a full subcategory
of KAT. The forgetful functor from KAT- to KAT is denoted by $G$.
Conversely, Proposition 4 provides the mapping from the set of objects in KAT
to the set of objects in KAT-, that is, $(B, K, i) \mapsto (i[B], K, \subseteq)$, where $i[B]$
is a Boolean algebra consisting of $\mathrm{im}(i)$. This mapping determines a functor $F$
from KAT to KAT-. For an arrow $(f, g)$ from $(B, K, i)$ to $(B', K', i')$ in KAT,
$F((f, g)) = (g|_{\mathrm{im}(i)}, g)$ where $g|_{\mathrm{im}(i)}$ is a restriction of $g$ with respect to the set $\mathrm{im}(i)$.



Theorem 2. The functor $F$ is a left adjoint to $G$.

Proof. For each object $(B, K, i)$ in KAT and $(B', K', \subseteq)$ in KAT-, define the
mapping $\phi_{(B,K,i),(B',K',\subseteq)}$ from the homset KAT-$(F((B, K, i)), (B', K', \subseteq))$ to
the homset KAT$((B, K, i), G((B', K', \subseteq)))$ as follows:

$$(f, g) \mapsto (f \circ i,\ g)$$

where $f : i[B] \to B'$ and $g : K \to K'$. In the sequel, $\phi$ means $\phi_{(B,K,i),(B',K',\subseteq)}$.
For each arrow $(h, k)$ from $(B, K, i)$
to $(B', K', \subseteq)$ in KAT, there exists an arrow $(k|_{\mathrm{im}(i)}, k)$ from $(i[B], K, \subseteq)$ to
$(B', K', \subseteq)$ in KAT-, and it satisfies $\phi((k|_{\mathrm{im}(i)}, k)) = (h, k)$ since $k|_{\mathrm{im}(i)} \circ i =
k \circ i = h$. Thus, $\phi$ is onto. If $\phi((f, g)) = \phi((f', g'))$, it is immediate that $g = g'$,
then we have $f = f'$ since $f$ and $f'$ are determined by $g$ as a restriction. So, $\phi$
is one-to-one. Therefore, $\phi$ is an isomorphism.



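We now have the following sequence of adjunctions:

$$F_B \times F_K \dashv U_B \times U_K : \mathbf{Bool} \times \mathbf{Kleene} \to \mathbf{Set} \times \mathbf{Set},$$
$$\Phi \dashv \Psi : \mathbf{KAT} \to \mathbf{Bool} \times \mathbf{Kleene},$$
$$F \dashv G : \text{KAT-} \to \mathbf{KAT} .$$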



The completeness theorem in [7] provides the property that the Kleene algebra
with tests $(B_T, K_{T,\Sigma}, \subseteq)$ given by Kozen-Smith’s construction is isomorphic to
$F(\Phi((F_B \times F_K)(T, \Sigma)))$ for a finite set $T$ and (possibly infinite) set $\Sigma$, where $B_T$
is a Boolean algebra consisting of the powerset $\wp(A_{F_B(T)})$ of the set of atoms



140 



Hitoshi Furusawa 



of the Boolean algebra $F_B(T)$, and $K_{T,\Sigma}$ is a Kleene algebra consisting of the set of
regular sets of guarded strings over $T$ and $\Sigma$. Note that, since $T$ is finite, $F_B(T)$
is atomic and $F_B(T) \cong B_T$.

The second half of Proposition 4 suggests that if the third component $\varphi(i)$
of the free Kleene algebra with tests $(F_B(T), F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma), \varphi(i))$
is one-to-one, this is isomorphic to $F(\Phi((F_B \times F_K)(T, \Sigma)))$ for all sets $T$ and
$\Sigma$. However, Example 1 shows that the third component of our Kleene algebra
with tests is not always one-to-one. Also, the injections of a coproduct of Kleene
algebras are not always one-to-one.

For a (possibly infinite) set $T$, the greatest element and the least element
of $F_B(T)$ are not equal. So, by Proposition 2, $F_{IK}(U_{BI}(F_B(T)))$ is non-trivial.
Also, by Remark 2, $F_K(\Sigma)$ is integral. Thus we have the following property.

Theorem 3. The third component $\varphi(i)$ of the free Kleene algebra with tests
$(F_B(T), F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma), \varphi(i))$ is one-to-one for each pair $(T, \Sigma)$ of
sets.

Proof. Take the free Kleene algebra $(F_B(T), F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma), \varphi(i))$.
$U_{KI}(i)$ is one-to-one since, by Proposition 1, $i$ is one-to-one and $U_{KI}(i)$ is $i$
itself. Replace $S$ with $U_{BI}(F_B(T))$ in the proof of Proposition 2. Then, $\varphi(i) =
U_{KI}(i) \circ \varphi(\mathrm{id}_{F_{IK}(U_{BI}(F_B(T)))})$ since $\varphi(\mathrm{id}_{F_{IK}(U_{BI}(F_B(T)))})$ is a component of the
unit of $F_{IK} \dashv U_{KI}$ with respect to $U_{BI}(F_B(T))$. As in the proof of Proposition 2,
$\varphi(\mathrm{id}_{F_{IK}(U_{BI}(F_B(T)))})$ is an isomorphism from $U_{BI}(F_B(T))'$ to $F_{IK}(U_{BI}(F_B(T)))$
in Kleene. Thus, $\varphi(\mathrm{id}_{F_{IK}(U_{BI}(F_B(T)))})$ is one-to-one. Therefore, $\varphi(i)$ is one-to-one.

Corollary 4. Our free algebra $(F_B(T), F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma), \varphi(i))$ is
isomorphic to $F(\Phi((F_B \times F_K)(T, \Sigma)))$.

Corollary 5. If $T$ is a finite set, $(F_B(T), F_{IK}(U_{BI}(F_B(T))) + F_K(\Sigma), \varphi(i))$
is isomorphic to the Kleene algebra with tests $(B_T, K_{T,\Sigma}, \subseteq)$ given by Kozen-Smith’s
construction.

In [3,4] it was shown that the unital quantale $Q_{T,\Sigma}$ consisting of the set of
languages of guarded strings over $T$ and $\Sigma$ can be presented as a coproduct of
unital quantales [9]:

Unital quantales are monoids with arbitrary join in which left and right
multiplications are universally additive. Left adjoints to the forgetful
functors from the category of unital quantales to Set and to ISR exist,
which are denoted by $F_Q$ and $F_{IQ}$, respectively. Then $Q_{T,\Sigma}$ is a
coproduct of $F_{IQ}(U_{BI}(F_B(T)))$ and $F_{IQ}(\Sigma)$.

By Corollary 5, analogously, $K_{T,\Sigma}$ is presented as a coproduct of Kleene algebras.

Proposition 6. The Kleene algebra $K_{T,\Sigma}$ is a coproduct of $F_{IK}(U_{BI}(F_B(T)))$
and $F_K(\Sigma)$ in Kleene.

6 Conclusion

Starting from a definition of Kleene algebras with tests that is slightly more 
general than Kozen’s definition, we have provided a free construction of Kleene 
algebras with tests. Though the starting point is different from Kozen and Smith, 
the results of both constructions are isomorphic to each other. Since our con- 
struction has been given as a combination of basic notions such as adjunctions 
between Kleene and the categories of related algebraic structures, and coproducts
in Kleene, the free algebras are generated quite systematically. In particular,
the bijective correspondence $\varphi$ provided by the adjunction $F_{IK} \dashv U_{KI}$ and the
notion of coproduct in Kleene played an important role in our free construction.
This systematic manner allowed us to construct free Kleene algebras with
tests generated by infinitely many generators without extra effort. The fact that
the Kleene algebras consisting of regular sets of guarded strings are presented
as coproducts in Kleene may be helpful for more mathematical investigation of
these Kleene algebras.

Abstract. Unfolds generate data structures, and folds consume them.
A hylomorphism is a fold after an unfold, generating then consuming a 
virtual data structure. A metamorphism is the opposite composition, an 
unfold after a fold; typically, it will convert from one data representation 
to another. In general, metamorphisms are less interesting than hylomor- 
phisms: there is no automatic fusion to deforest the intermediate virtual 
data structure. However, under certain conditions fusion is possible: some 
of the work of the unfold can be done before all of the work of the fold 
is complete. This permits streaming metamorphisms , and among other 
things allows conversion of infinite data representations. We present a 
theory of metamorphisms and outline some examples. 



1 Introduction 

Folds and unfolds in functional programming [18, 28, 3] are well-known tools in 
the programmer’s toolbox. Many programs that consume a data structure follow 
the pattern of a fold; and dually, many that produce a data structure do so as 
an unfold. In both cases, the structure of the program is determined by the 
structure of the data it processes. 

It is natural to consider also compositions of these operations. Meijer [30] 
coined the term hylomorphism for the composition of a fold after an unfold. The 
virtual data structure [35] produced by the unfold is subsequently consumed by 
the fold; the structure of that data determines the structure of both its producer 
and its consumer. Under certain rather weak conditions, the intermediate data 
structure may be eliminated or deforested [39], and the two phases fused into 
one slightly more efficient one. 

In this paper, we consider the opposite composition, of an unfold after a fold. 
Programs of this form consume an input data structure using a fold, construct- 
ing some intermediate (possibly unstructured) data, and from this intermediary 
produce an output data structure using an unfold. Note that the two data struc- 
tures may now be of different shapes, since they do not meet. Indeed, such 
programs may often be thought of as representation changers, converting from 
one structured representation of some abstract data to a different structured 
representation. Despite the risk of putting the reader off with yet another neolo- 
gism of Greek origin, we cannot resist coining the term metamorphism for such 
compositions, because they typically metamorphose representations. 



D. Kozen (Ed.): MPC 2004, LNCS 3125, pp. 142-168, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004

In general, metamorphisms are perhaps less interesting than hylomorphisms,
because there is no nearly-automatic deforestation. Nevertheless, sometimes fu- 
sion is possible; under certain conditions, some of the unfolding may be per- 
formed before all of the folding is complete. This kind of fusion can be helpful 
for controlling the size of the intermediate data. Perhaps more importantly, it 
can allow conversions between infinite data representations. For this reason, we 
call such fused metamorphisms streaming algorithms ; they are the main sub- 
ject of this paper. We encountered them fortuitously while trying to describe 
some data compression algorithms [4], but have since realized that they are an 
interesting construction in their own right. 

The remainder of this paper is organized as follows. Section 2 summarizes 
the theory of folds and unfolds. Section 3 introduces metamorphisms, which are 
unfolds after folds. Section 4 presents a theory of streaming, which is the main 
topic of the paper. Section 5 provides an extended application of streaming, and 
Section 6 outlines two other applications described in more detail elsewhere. 
Finally, Section 7 discusses some ideas for generalizing the currently rather list- 
oriented theory, and describes related work. 



2 Origami Programming 

We are interested in capturing and studying recurring patterns of computation, 
such as folds and unfolds. As has been strongly argued by the recently popular 
design patterns movement [8], identifying and exploring such patterns has many 
benefits: reuse of abstractions, rendering ‘folk knowledge’ in a more accessible 
format, providing a common vocabulary of discourse, and so on. What distin- 
guishes patterns in functional programming from patterns in object-oriented and 
other programming paradigms is that the better ‘glue’ available in the former 
[20] allows the patterns to be expressed as abstractions within the language, 
rather than having to resort to informal prose and diagrams. 

We use the notation of Haskell [25], the de facto standard lazy functional 
programming language, except that we take the liberty to use some typographic 
effects in formatting, and to elide some awkwardnesses (such as type coercions 
and qualifications) that are necessary for programming but that obscure the 
points we are trying to make. 

Most of this paper involves the datatype of lists:

    data [α] = [] | α : [α]

That is, the datatype [α] of lists with elements of type α consists of the empty
list [], and non-empty lists of the form a : x with head a :: α and tail x :: [α].

The primary patterns of computation over such lists are the fold, which con- 
sumes a list and produces some value: 

    foldr :: (α → β → β) → β → [α] → β
    foldr f b [] = b
    foldr f b (a : x) = f a (foldr f b x)

and the unfold [16], which produces a list from some seed:

    unfoldr :: (β → Maybe (α, β)) → β → [α]
    unfoldr f b = case f b of
        Just (a, b') → a : unfoldr f b'
        Nothing → []

Here, the datatype Maybe augments a type α with an additional value Nothing:

    data Maybe α = Nothing | Just α
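To make the two patterns concrete, here is a small sketch in plain Haskell that uses the Prelude's `foldr` and `Data.List.unfoldr`, which behave as defined above. The example functions `sumList`, `countdown`, and `powersOfTwo` are illustrative names introduced here, not taken from the paper.

```haskell
import Data.List (unfoldr)

-- foldr replaces (:) by the given operator and [] by the given value:
--   foldr (+) 0 [1, 2, 3] = 1 + (2 + (3 + 0))
sumList :: [Integer] -> Integer
sumList = foldr (+) 0

-- unfoldr grows a list from a seed; here the seed 0 ends the list.
countdown :: Integer -> [Integer]
countdown = unfoldr step
  where
    step 0 = Nothing
    step n = Just (n, n - 1)

-- A seed function that never returns Nothing yields an infinite list.
powersOfTwo :: [Integer]
powersOfTwo = unfoldr (\n -> Just (n, 2 * n)) 1
```

Note that `countdown` terminates only for non-negative seeds, and that `powersOfTwo` is usable only because lazy evaluation lets a consumer such as `take` demand a finite prefix.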

The foldr pattern consumes list elements from right to left (following the 
way lists are constructed); as a variation on this, there is another fold which 
consumes elements from left to right: 

    foldl :: (β → α → β) → β → [α] → β
    foldl f b [] = b
    foldl f b (a : x) = foldl f (f b a) x

We also use the operator scanl, which is like foldl but which returns all partial 
results instead of just the final one: 

    scanl :: (β → α → β) → β → [α] → [β]
    scanl f b [] = [b]
    scanl f b (a : x) = b : scanl f (f b a) x
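The relationship between `foldl` and `scanl` can be seen in a short sketch that reads a list of decimal digits, most significant first. The functions `fromDigits` and `prefixes` are hypothetical names used only for this illustration; the Prelude's `foldl` and `scanl` agree with the definitions above.

```haskell
-- Reading decimal digits, most significant first, as a left fold.
fromDigits :: [Integer] -> Integer
fromDigits = foldl (\acc d -> 10 * acc + d) 0

-- The same traversal with scanl keeps every intermediate accumulator,
-- starting with the initial value.
prefixes :: [Integer] -> [Integer]
prefixes = scanl (\acc d -> 10 * acc + d) 0
```

The last element of `prefixes xs` is `fromDigits xs`, and the result of `scanl` on a list of length n always has n + 1 elements.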

We introduce also a datatype of internally-labelled binary trees: 

    data Tree α = Node (Maybe (α, Tree α, Tree α))

with fold operator

    foldt :: (Maybe (α, β, β) → β) → Tree α → β
    foldt f (Node Nothing) = f Nothing
    foldt f (Node (Just (a, t, u))) = f (Just (a, foldt f t, foldt f u))

and unfold operator

    unfoldt :: (β → Maybe (α, β, β)) → β → Tree α
    unfoldt f b = case f b of
        Nothing → Node Nothing
        Just (a, b1, b2) → Node (Just (a, unfoldt f b1, unfoldt f b2))
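The tree operators can be exercised with a small sketch in plain Haskell. The definitions of `Tree`, `foldt`, and `unfoldt` are transcribed from above into ASCII syntax; the helper functions `grow`, `flatten`, and `treeSort` are assumptions made for this illustration: an unfold that grows a binary search tree from a list, followed by a fold that flattens it in order.

```haskell
data Tree a = Node (Maybe (a, Tree a, Tree a))

foldt :: (Maybe (a, b, b) -> b) -> Tree a -> b
foldt f (Node Nothing) = f Nothing
foldt f (Node (Just (a, t, u))) = f (Just (a, foldt f t, foldt f u))

unfoldt :: (b -> Maybe (a, b, b)) -> b -> Tree a
unfoldt f b = case f b of
  Nothing -> Node Nothing
  Just (a, b1, b2) -> Node (Just (a, unfoldt f b1, unfoldt f b2))

-- Seed function: the head is the pivot, the rest is split around it.
grow :: Ord a => [a] -> Maybe (a, [a], [a])
grow [] = Nothing
grow (p : xs) = Just (p, filter (< p) xs, filter (>= p) xs)

-- Fold function: in-order concatenation of the two flattened subtrees.
flatten :: Maybe (a, [a], [a]) -> [a]
flatten Nothing = []
flatten (Just (a, l, r)) = l ++ [a] ++ r

-- A fold after an unfold: sort by building and then flattening a search tree.
treeSort :: Ord a => [a] -> [a]
treeSort = foldt flatten . unfoldt grow
```

For instance, `treeSort [3, 1, 2]` first unfolds a tree with root 3 and then folds it back into an ordered list.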

(It would be more elegant to define lists and their recursion patterns in the 
same style, but for consistency with the Haskell standard prelude we adopt its 
definitions. We could also condense the above code by using various higher-order 
combinators, but for accessibility we refrain from doing so.) 

The remaining notation will be introduced as it is encountered. For more 
examples of the use of these and related patterns in functional programming, 
see [13], and for the theory behind this approach to programming, see [12]; for 
a slightly different view of both, see [3].

3 Metamorphisms

In this section we present three simple examples of metamorphisms, or repre- 
sentation changers in the form of unfolds after folds. These three represent the 
entire spectrum of possibilities: it turns out that the first permits streaming au- 
tomatically (assuming lazy evaluation), the second does so with some work, and 
the third does not permit it at all. 



3.1 Reformatting Lines 



The classic application of metamorphisms is for dealing with structure clashes
[22]: data is presented in a format that is inconvenient for a particular kind
of processing, so it needs to be rearranged into a more convenient format. For
example, a piece of text might be presented in 70-character lines, but required
for processing in 60-character lines. Rather than complicate the processing by
having to keep track of where in a given 70-character line a virtual 60-character
line starts, good practice would be to separate the concerns of rearranging the
data and of processing it. A control-oriented or imperative view of this task
can be expressed in terms of coroutines: one coroutine, the processor, repeatedly
requests the other, the rearranger, for the next 60-character line. A data-oriented
or declarative view of the same task consists of describing the intermediate data
structure, a list of 60-character lines. With lazy evaluation, the two often turn
out to be equivalent; but the data-oriented view may be simpler, and is certainly
the more natural presentation in functional programming.

We define the following Haskell functions. 



    reformat :: Integer → [[α]] → [[α]]
    reformat n = writeLines n · readLines

    readLines :: [[α]] → [α]
    readLines = foldr (++) []

    writeLines :: Integer → [α] → [[α]]
    writeLines n = unfoldr (split n)
        where split n [] = Nothing
              split n x = Just (splitAt n x)



The function readLines is just what is called concat in the Haskell standard
prelude; we have written it explicitly as a fold here to emphasize the program
structure. The function writeLines n partitions a list into segments of length n,
the last segment possibly being short. (The operator ‘·’ denotes function
composition, and ‘++’ is list concatenation.)

The function reformat fits our definition of a metamorphism, since it consists
of an unfold after a fold. Because ++ is non-strict in its right-hand argument,
reformat is automatically streaming when lazily evaluated: the first lines of
output can be produced before all the input has been consumed. Thus, we need
not maintain the whole concatenated list (the result of readLines) in memory at
once, and we can even reformat infinite lists of lines.
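The streaming behaviour can be observed directly. The sketch below is a plain-Haskell transcription of the definitions above, with one liberty made explicit: the line length has type `Int` rather than `Integer`, because the Prelude's `splitAt` takes an `Int` (one of the coercions the paper's notation elides). It assumes a positive line length; with n = 0 the unfold would never make progress.

```haskell
import Data.List (unfoldr)

-- Concatenate the input lines, written as a fold.
readLines :: [[a]] -> [a]
readLines = foldr (++) []

-- Cut a list into segments of length n (the last one may be shorter).
-- Assumes n > 0.
writeLines :: Int -> [a] -> [[a]]
writeLines n = unfoldr split
  where
    split [] = Nothing
    split x = Just (splitAt n x)

-- An unfold after a fold: a metamorphism.
reformat :: Int -> [[a]] -> [[a]]
reformat n = writeLines n . readLines
```

For example, `reformat 3 ["abcd", "ef", "g"]` regroups seven characters into lines of three, and `take 2 (reformat 3 (repeat "ab"))` returns a result even though the input list of lines is infinite.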
3.2 Radix Conversion

Converting fractions from one radix to another is a change of representation. We 
define functions radixConvert, fromBase and toBase as follows:

    radixConvert :: (Integer, Integer) → [Integer] → [Integer]
    radixConvert (b, b') = toBase b' · fromBase b

    fromBase :: Integer → [Integer] → Rational
    fromBase b = foldr step_b 0

    toBase :: Integer → Rational → [Integer]
    toBase b = unfoldr next_b

where

    step_b n x = (x + n) ÷ b
    next_b 0 = Nothing
    next_b x = Just (⌊y⌋, y − ⌊y⌋) where y = b × x

Thus, fromBase b takes a (finite) list of digits and converts it into a fraction;
provided the digits are all at least zero and less than b, the resulting fraction
will be at least zero and less than one. For example,

    fromBase 10 [2, 5] = step_10 2 (step_10 5 0) = 1/4

Then toBase b takes a fraction between zero and one, and converts it into a
(possibly infinite) list of digits in base b. For example,

    toBase 2 (1/4) = 0 : unfoldr next_2 (1/2) = 0 : 1 : unfoldr next_2 0 = [0, 1]

Composing fromBase for one base with toBase for another effects a change of 
base. 
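The definitions above can be run after restoring the numeric coercions that the paper's notation leaves out. The following is a sketch under that assumption: `fromInteger` converts digits and bases to `Rational`, and `floor` plays the role of ⌊·⌋. The local names `step` and `next` stand for step_b and next_b.

```haskell
import Data.List (unfoldr)

-- Digits (most significant first) to a fraction in [0, 1).
fromBase :: Integer -> [Integer] -> Rational
fromBase b = foldr step 0
  where step n x = (x + fromInteger n) / fromInteger b

-- A fraction in [0, 1) to its (possibly infinite) digit list.
toBase :: Integer -> Rational -> [Integer]
toBase b = unfoldr next
  where
    next 0 = Nothing
    next x = Just (d, y - fromInteger d)
      where
        y = fromInteger b * x
        d = floor y

-- Change of base: an unfold after a fold.
radixConvert :: (Integer, Integer) -> [Integer] -> [Integer]
radixConvert (b, b') = toBase b' . fromBase b
```

With these definitions, `radixConvert (10, 2) [2, 5]` reproduces the worked example, and `take 5 (toBase 10 (1 / 3))` shows that the output may be infinite even when the input fraction is a single rational number.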

At first blush, this looks very similar in structure to the reformatting example
of Section 3.1. However, now the fold operator step_b is strict in its right-hand
argument. Therefore, fromBase b must consume its whole input before it generates
any output, so these conversions will not work for infinite fractions, and even
for finite fractions the entire input must be read before any output is generated.

Intuitively, one might expect to be able to do better than this. For example,
consider converting the decimal fraction [2,5] to the binary fraction [0,1]. The
initial 2 alone is sufficient to justify the production of the first bit 0 of the output:
whatever follows (provided that the input really does consist of decimal digits),
the fraction lies between 2/10 and 3/10, and so its binary representation must start
with a zero. We make this intuition precise in Section 4; it involves, among other
steps, inverting the structure of the traversal of the input by replacing the foldr
with a foldl.

Of course, digit sequences like this are not a good representation for fractions: 
many useful operations turn out to be uncomputable. In Section 5, we look at a 
better representation. It still turns out to leave some operations uncomputable 
(as any non-redundant representation must), but there are fewer of them.

3.3 Heapsort

As a third introductory example, we consider tree-based sorting algorithms. One 
such sorting algorithm is a variation on Hoare’s Quicksort [19]. What makes 
Quicksort particularly quick is that it is in-place, needing only logarithmic ex- 
tra space for the control stack; but it is difficult to treat in-place algorithms 
functionally, so we ignore that aspect. Structurally, Quicksort turns out to be
a hylomorphism: it unfolds the input list by repeated partitioning to produce a
binary search tree, then folds this tree to yield the output list.

We use the datatype of binary trees from Section 2. We also suppose functions

    partition :: [α] → Maybe (α, [α], [α])
    join :: Maybe (α, [α], [α]) → [α]

The first partitions a non-empty list into a pivot and the smaller and larger
elements (or returns Nothing given an empty list); the second concatenates a
pair of lists with a given element in between (or returns the empty list given
Nothing); we omit the definitions for brevity. Given these auxiliary functions, we
have

    quicksort = foldt join · unfoldt partition

as a hylomorphism.

One can sort also as a tree metamorphism: the same type of tree is an in- 
termediate data structure, but this time it is a minheap rather than a binary 
search tree: the element stored at each node is no greater than any element in 
either child of that node. Moreover, this time the tree producer is a list fold and 
the tree consumer is a list unfold. 

We suppose functions 

    insert :: α → Tree α → Tree α
    splitMin :: Tree α → Maybe (α, Tree α)

The first inserts an element into a heap; the second splits a heap into its least
element and the remainder (or returns Nothing, given the empty heap). Given
these auxiliary functions, we have

    heapsort = unfoldr splitMin · foldr insert (Node Nothing)

as a metamorphism. (Contrast this description of heapsort with the one given
by Augusteijn [1] in terms of hylomorphisms, driving the computation by the
shape of the intermediate tree rather than the two lists.)

Here, unlike in the reformatting and radix conversion examples, there is no 
hope for streaming: the second phase cannot possibly make any progress until 
the entire input is read, because the first element of the sorted output (which 
is the least element of the list) might be the last element of the input. Sorting 
is inherently a memory-intensive process, and cannot be performed on infinite 
lists.

4 Streaming

Of the three examples in Section 3, one automatically permits streaming and one 
can never do; only one, namely radix conversion, warrants further investigation 
in this regard. As suggested in Section 3.2, it ought to be possible to produce 
some of the output before all of the input is consumed. In this section, we see 
how this can be done, developing some general results along the way. 



4.1 The Streaming Theorem 

The second phase of the metamorphism involves producing the output, maintaining
some state in the process; that state is initialized to the result of folding
the entire input, and evolves as the output is unfolded. Streaming must involve
starting to unfold from an earlier state, the result of folding only some initial
part of the input. Therefore, it is natural to consider metamorphisms in which
the folding phase is an instance of foldl:

    unfoldr f · foldl g c

Essentially the problem is a matter of finding some kind of invariant of this
state that determines the initial behaviour of the unfold. This idea is captured
by the following definition.

Definition 1. The streaming condition for f and g is:

    f c = Just (b, c')  ⇒  f (g c a) = Just (b, g c' a)

for all a, b, c and c'.

Informally, the streaming condition states the following: if c is a state from which
the unfold would produce some output element (rather than merely the empty
list), then so is the modified state g c a for any a; moreover, the element b output
from c is the same as that output from g c a, and the residual states c' and g c' a
stand in the same relation as the starting states c and g c a. In other words, ‘the
next output produced’ is invariant under consuming another input.

This invariant property is sufficient for the unfold and the fold to be fused into 
a single process, which alternates (not necessarily strictly) between consuming 
inputs and producing outputs. We define: 

    stream :: (γ → Maybe (β, γ)) → (γ → α → γ) → γ → [α] → [β]
    stream f g c x = case f c of
        Just (b, c') → b : stream f g c' x
        Nothing → case x of
            a : x' → stream f g (g c a) x'
            [] → []

Informally, stream f g :: γ → [α] → [β] involves a producer f and a consumer g;
maintaining a state c, it consumes an input list x and produces an output list y.
If f can produce an output element b from the state c, this output is delivered
and the state revised accordingly. If f cannot, but there is an input a left, this
input is consumed and the state revised accordingly. When the state is ‘wrung
dry’ and the input is exhausted, the process terminates.

Formally, the relationship between the metamorphism and the streaming 
algorithm is given by the following theorem. 

Theorem 2 (Streaming Theorem [4]). If the streaming condition holds for
f and g, then

    stream f g c x = unfoldr f (foldl g c x)

on finite lists x.

Proof. The proof is given in [4]. We prove a stronger theorem (Theorem 4) later. 

Note that the result relates behaviours on finite lists only: on infinite lists, the 
foldl never yields a result, so the metamorphism may not either, whereas the 
streaming process can be productive — indeed, that is the main point of intro- 
ducing streaming in the first place. 

As a simple example, consider the functions unCons and snoc, defined as 
follows: 

    unCons [] = Nothing
    unCons (a : x) = Just (a, x)

    snoc x a = x ++ [a]

The streaming condition holds for unCons and snoc: unCons x = Just (b, x')
implies unCons (snoc x a) = Just (b, snoc x' a). Therefore, Theorem 2 applies,
and

    unfoldr unCons · foldl snoc [] = stream unCons snoc []

on finite lists (but not infinite ones!). The left-hand side is a two-stage copying 
process with an unbounded intermediate buffer, and the right-hand side a one- 
stage copying queue with a one-place buffer. 
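
The copying example can be run directly. The sketch below transcribes stream, unCons and snoc into Haskell; the names `copy` and `twoStage` are introduced here purely for comparison and do not appear in the text.

```haskell
import Data.List (unfoldr)

-- stream f g c x: f is the producer, g the consumer, c the current state.
stream :: (c -> Maybe (b, c)) -> (c -> a -> c) -> c -> [a] -> [b]
stream f g c x = case f c of
  Just (b, c') -> b : stream f g c' x
  Nothing      -> case x of
    a : x' -> stream f g (g c a) x'
    []     -> []

unCons :: [a] -> Maybe (a, [a])
unCons []      = Nothing
unCons (a : x) = Just (a, x)

snoc :: [a] -> a -> [a]
snoc x a = x ++ [a]

-- One-stage copy: each element is emitted as soon as it has been consumed.
copy :: [a] -> [a]
copy = stream unCons snoc []

-- Two-stage copy: the whole input is buffered before anything is emitted.
twoStage :: [a] -> [a]
twoStage = unfoldr unCons . foldl snoc []
```

On finite lists the two agree, as Theorem 2 predicts. On an infinite list only `copy` is productive: `take 3 (copy [1 ..])` terminates, whereas the same expression with `twoStage` does not.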

4.2 Reversing the Order of Evaluation 

In order to make a streaming version of radix conversion, we need to rewrite 
fromBase b as an instance of foldl rather than of foldr. Fortunately, there is a 
standard technique for doing this: 

    foldr f b = applyto b · foldr (·) id · map f

where applyto b f = f b. Because composition is associative with unit id, the foldr
on the right-hand side can, by the First Duality Theorem [6], be replaced
by a foldl.

Although valid, this is not always very helpful. In particular, it can be quite
inefficient: the fold now constructs a long composition of little functions of the
form f a, and this composition typically cannot be simplified until it is eventually
applied to b. However, it is often possible to find some representation of
those little functions that admits composition and application in constant time.
Reynolds [34] calls this transformation defunctionalization.

Theorem 3. Given fold arguments f :: α → β → β and b :: β, suppose there is
a type ρ of representations of functions of the form f a and their compositions,
with the following operations:

- a representation function rep :: α → ρ (so that rep a is the representation of
  f a);

- an abstraction function abs :: ρ → β → β, such that abs (rep a) = f a;

- an analogue ⊙ :: ρ → ρ → ρ of function composition, such that abs (r ⊙ s) =
  abs r · abs s;

- an analogue ident :: ρ of the identity function, such that abs ident = id;

- an analogue app_b :: ρ → β of application to b, such that app_b r = abs r b.

Then

    foldr f b = app_b · foldl (⊙) ident · map rep

The foldl and the map can be fused:

    foldr f b = app_b · foldl (⊛) ident

where r ⊛ a = r ⊙ rep a.

(Note that the abstraction function abs is used above only for stating the
correctness conditions; it is not applied anywhere.)

For example, let us return to radix conversion, as introduced in Section 3.2.
The ‘little functions’ here are of the form step_b n, or equivalently, (÷b) · (+n).
This class of functions is closed under composition:

      (step_c n · step_b m) x
    = {composition}
      step_c n (step_b m x)
    = {step}
      ((x + m) ÷ b + n) ÷ c
    = {arithmetic}
      (x + m + b × n) ÷ (b × c)
    = {composition}
      ((÷(b × c)) · (+(m + b × n))) x

We therefore defunctionalize step_b n to the pair (n, b), and define:

    rep_b n = (n, b)
    abs (n, b) x = (x + n) ÷ b
    (n, c) ⊙ (m, b) = (m + b × n, b × c)
    (n, c) ⊛_b m = (n, c) ⊙ rep_b m = (m + b × n, b × c)
    ident = (0, 1)
    app (n, b) = abs (n, b) 0 = n ÷ b

Theorem 3 then tells us that

    fromBase b = app · foldl (⊛_b) ident
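
As a sanity check on this instance of Theorem 3, the pair representation can be run as a left fold and compared with the original right fold. This is a minimal sketch; the names `State`, `consume`, `fromBaseL` and `fromBaseR` are assumptions made for the example (`consume b` plays the role of ⊛_b).

```haskell
import Data.Ratio ((%))

-- A pair (n, d) represents the function \x -> (x + n) / d.
type State = (Integer, Integer)

-- Consume one more base-b digit m: (n, c) ⊛_b m = (m + b*n, b*c).
consume :: Integer -> State -> Integer -> State
consume b (n, c) m = (m + b * n, b * c)

ident :: State
ident = (0, 1)

-- Apply the represented function to 0, giving n / d.
app :: State -> Rational
app (n, d) = n % d

-- Left-fold version obtained from Theorem 3.
fromBaseL :: Integer -> [Integer] -> Rational
fromBaseL b = app . foldl (consume b) ident

-- The original right fold, for comparison.
fromBaseR :: Integer -> [Integer] -> Rational
fromBaseR b = foldr (\n x -> (x + fromInteger n) / fromInteger b) 0
```

For the digits [2, 5] in base 10 the state evolves as (0, 1), (2, 10), (25, 100), and `app` yields 1/4, matching the right fold.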
4.3 Checking the Streaming Condition

We cannot quite apply Theorem 2 yet, because the composition of toBase c
and the revised fromBase b has the abstraction function app between the unfold
and the fold. Fortunately, that app fuses with the unfold. For brevity below, we
define

    mapl f Nothing = Nothing
    mapl f (Just (a, b)) = Just (a, f b)

(that is, mapl is the map operation of the base functor for the list datatype);
then

      unfoldr next_c · app = unfoldr nextapp_c
    ⇐ {unfold fusion}
      next_c · app = mapl app · nextapp_c

and

      next_c (app (n, r))
    = {app, next_c; let u = ⌊n × c ÷ r⌋}
      if n == 0 then Nothing else Just (u, n × c ÷ r − u)
    = {app; there is some leeway here (see below)}
      if n == 0 then Nothing else Just (u, app (n − u × r ÷ c, r ÷ c))
    = {mapl}
      mapl app (if n == 0 then Nothing else Just (u, (n − u × r ÷ c, r ÷ c)))

Therefore we try defining

    nextapp_c (n, r) = if n == 0 then Nothing else Just (u, (n − u × r ÷ c, r ÷ c))
        where u = ⌊n × c ÷ r⌋

Note that there was some leeway here: we had to partition the rational
n × c ÷ r − u into a numerator and denominator, and we chose
(n − u × r ÷ c, r ÷ c) out of the many ways of doing this. One might perhaps
have expected (n × c − u × r, r) instead; however, this leads to a dead-end,
as we show later. Note that our choice involves generalizing from integer to
rational components.

Having now massaged our radix conversion program into the correct format:

    radixConvert (b, c) = unfoldr nextapp_c · foldl (⊛_b) ident

we may consider whether the streaming condition holds for nextapp_c and ⊛_b;
that is, whether

    nextapp_c (n, r) = Just (u, (n', r'))
    ⇒
    nextapp_c ((n, r) ⊛_b m) = Just (u, (n', r') ⊛_b m)

An element u is produced from a state (n, r) iff n ≠ 0, in which case u =
⌊n × c ÷ r⌋. The modified state (n, r) ⊛_b m evaluates to (m + b × n, b × r). Since
n, b > 0 and m ≥ 0, this necessarily yields an element; this element v equals
⌊(m + b × n) × c ÷ (b × r)⌋. We have to check that u and v are equal. Sadly,
in general they are not: since 0 ≤ m < b, it follows that v lies between u and
⌊(n + 1) × c ÷ r⌋, but these two bounds need not meet.

Intuitively, this can happen when the state has not completely determined
the next output, and further inputs are needed in order to make a commitment
to that output. For example, consider having consumed the first digit 6 while
converting the sequence [6, 7] in decimal (representing 0.67_10) to ternary. The
fraction 0.6_10 is about 0.1210_3; nevertheless, it is not safe to commit to producing
the digit 1, because the true result is greater than 0.2_3, and there is not enough
information to decide whether to output a 1 or a 2 until the 7 has been consumed
as well.
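
The counterexample can be checked numerically. The sketch below works on integer states only (sufficient here, since no digit has yet been emitted); `digit`, `consume`, `early` and `late` are names assumed for the illustration.

```haskell
-- Next base-c digit from state (n, r), which represents the fraction n/r.
digit :: Integer -> (Integer, Integer) -> Integer
digit c (n, r) = (n * c) `div` r

-- Consume one more base-b digit m.
consume :: Integer -> (Integer, Integer) -> Integer -> (Integer, Integer)
consume b (n, r) m = (m + b * n, b * r)

-- Ternary digit suggested after reading only the 6,
-- versus after reading both the 6 and the 7.
early, late :: Integer
early = digit 3 (consume 10 (0, 1) 6)
late  = digit 3 (consume 10 (consume 10 (0, 1) 6) 7)
```

Here `early` is 1 (from the state (6, 10)) while `late` is 2 (from the state (67, 100)), so committing after the first digit would have been wrong.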

This is a common situation with streaming algorithms: the producer function 
( nextapp above) needs to be more cautious when interleaved with consumption 
steps than it does when all the input has been consumed. In the latter situation, 
there are no further inputs to invalidate a commitment made to an output; but 
in the former, a subsequent input might invalidate whatever output has been 
produced. The solution to this problem is to introduce a more sophisticated 
version of streaming, which proceeds more cautiously while input remains, but 
switches to the normal more aggressive mode if and when the input is exhausted. 
That is the subject of the next section. 

4.4 Flushing Streams 

The typical approach is to introduce a ‘restriction’

    snextapp = guard safe nextapp

of nextapp for some predicate safe, where

    guard p f x = if p x then f x else Nothing

and to use snextapp as the producer for the streaming process. In the case of
radix conversion, the predicate safe_c (dependent on the output base c) could be
defined

    safe_c (n, r) = (⌊n × c ÷ r⌋ == ⌊(n + 1) × c ÷ r⌋)

That is, the state (n, r) is safe for the output base c if these lower and upper
bounds on the next digit meet; with this proviso, the streaming condition holds,
as we checked above. (In fact, we need to check not only that the same elements
are produced from the unmodified and the modified state, but also that the two
residual states are related in the same way as the two original states. With the
definition of nextapp_c that we chose above, this second condition does hold; with
the more obvious definition involving (n × c − u × r, r) that we rejected, it does
not.)

However, with this restricted producer the streaming process no longer has
the same behaviour on finite lists as does the plain metamorphism: when the
input is exhausted, the more cautious snextapp may have left some outputs
still to be produced that the more aggressive nextapp would have emitted. The
streaming process should therefore switch into a final ‘flushing’ phase when all
the input has been consumed.

This insight is formalized in the following generalization of stream:

    fstream :: (γ → Maybe (β, γ)) → (γ → α → γ) → (γ → [β]) → γ → [α] → [β]
    fstream f g h c x = case f c of
        Just (b, c') → b : fstream f g h c' x
        Nothing → case x of
            a : x' → fstream f g h (g c a) x'
            [] → h c

The difference between fstream and stream is that the former has an extra
argument, h, a ‘flusher’; when the state is wrung as dry as it can be and the input
is exhausted, the flusher is applied to the state to squeeze the last few elements
out. This is a generalization, because supplying the trivial flusher that always
returns the empty list reduces fstream to stream.

The relationship of metamorphisms to flushing streams is a little more com- 
plicated than that to ordinary streams. One way of expressing the relationship 
is via a generalization of unfoldr, whose final action is to generate a whole tail 
of the resulting list rather than the empty list. This is an instance of primi- 
tive corecursion (called an apomorphism by Vene and Uustalu [37]), which is 
the categorical dual of primitive recursion (called a paramorphism by Meertens 
[29]). 

    apol :: (β → Maybe (α, β)) → (β → [α]) → β → [α]
    apol f h b = case f b of
        Just (a, b') → a : apol f h b'
        Nothing → h b

Informally, apol f h b = unfoldr f b ++ h b', where b' is the final state of the
unfold (if there is one; if there is not, the value of h b' is irrelevant), and
unfoldr f = apol f (const []). On finite inputs, provided that the streaming
condition holds, a flushing stream process yields the same result as the ordinary
streaming process, but with the results of flushing the final state (if any)
appended.

Theorem 4 (Flushing Stream Theorem). If the streaming condition holds
for f and g, then

    fstream f g h c x = apol f h (foldl g c x)

on finite lists x.

The proof uses the following lemma [4], which lifts the streaming condition from
single inputs to finite lists of inputs.

Lemma 5. If the streaming condition holds for f and g, then

    f c = Just (b, c')  ⇒  f (foldl g c x) = Just (b, foldl g c' x)

for all b, c, c' and finite lists x.

It also uses the approximation lemma [5,15].

Lemma 6 (Approximation Lemma). For finite, infinite or partial lists x
and y,

    x = y  ≡  ∀n. approx n x = approx n y

where

    approx :: Int → [α] → [α]
    approx (n + 1) [] = []
    approx (n + 1) (a : x) = a : approx n x

(Note that approx 0 x = ⊥ for any x, by case exhaustion.)

Proof (of Theorem 4). By Lemma 6, it suffices to show, for fixed f, g, h and for
all n and finite x, that

    ∀c. approx n (fstream f g h c x) = approx n (apol f h (foldl g c x))

under the assumption that the streaming condition holds for f and g. We use a
‘double induction’ simultaneously over n and the length #x of x. The inductive
hypothesis is that

    ∀c. approx m (fstream f g h c y) = approx m (apol f h (foldl g c y))

for any m, y such that m < n ∧ #y ≤ #x or m ≤ n ∧ #y < #x. We then proceed
by case analysis to complete the inductive step.

Case f c = Just (b, c'). In this case, we make a subsidiary case analysis on n.

Subcase n = 0. Then the result holds trivially.

Subcase n = n' + 1. Then we have:

      approx (n' + 1) (apol f h (foldl g c x))
    = {Lemma 5: f (foldl g c x) = Just (b, foldl g c' x)}
      approx (n' + 1) (b : apol f h (foldl g c' x))
    = {approx}
      b : approx n' (apol f h (foldl g c' x))
    = {induction: n' < n}
      b : approx n' (fstream f g h c' x)
    = {approx}
      approx (n' + 1) (b : fstream f g h c' x)
    = {fstream; case assumption}
      approx (n' + 1) (fstream f g h c x)

Case f c = Nothing. In this case, we make a subsidiary case analysis on x.

Subcase x = a : x'. Then

      apol f h (foldl g c (a : x'))
    = {foldl}
      apol f h (foldl g (g c a) x')
    = {induction: #x' < #x}
      fstream f g h (g c a) x'
    = {fstream; case assumption}
      fstream f g h c (a : x')

Subcase x = []. Then

      apol f h (foldl g c [])
    = {foldl}
      apol f h c
    = {case assumption}
      h c
    = {fstream; case assumption}
      fstream f g h c []



4.5 Invoking the Flushing Stream Theorem 

Theorem 4 gives conditions under which an apomorphism applied to the result 
of a foldl may be streamed. This seems of limited use, since such scenarios are 
not commonly found. However, they can be constructed from more common 
scenarios in which the apomorphism is replaced with a simpler unfold. One way 
is to introduce the trivial apomorphism, whose flusher always returns the empty 
list. A more interesting, and the most typical, way is via the observation that 

    apol (guard p f) (unfoldr f) = unfoldr f

for any predicate p. Informally, the work of an unfold can be partitioned into
‘cautious’ production, using the more restricted producer guard p f, followed by
more ‘aggressive’ production using simply f when the more cautious producer
blocks.
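
The observation can be illustrated on a small example. In the sketch below, apol and guard follow the definitions in the text, while the producer `countdown`, the predicate `(> 3)` and the name `split` are hypothetical choices made only for the demonstration.

```haskell
import Data.List (unfoldr)

apol :: (b -> Maybe (a, b)) -> (b -> [a]) -> b -> [a]
apol f h b = case f b of
  Just (a, b') -> a : apol f h b'
  Nothing      -> h b

guard :: (b -> Bool) -> (b -> Maybe c) -> b -> Maybe c
guard p f x = if p x then f x else Nothing

-- Hypothetical producer: count down from a non-negative n to 1.
countdown :: Int -> Maybe (Int, Int)
countdown 0 = Nothing
countdown n = Just (n, n - 1)

-- Cautious phase emits while the state exceeds 3;
-- the flusher (the plain unfold) then finishes the job.
split :: Int -> [Int]
split = apol (guard (> 3) countdown) (unfoldr countdown)
```

For the state 6, the cautious producer emits 6, 5, 4 and then blocks at 3, after which the flusher emits 3, 2, 1; the overall result equals `unfoldr countdown 6`.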



4.6 Radix Conversion as a Flushing Stream 

Returning for a final time to radix conversion, we define

    snextapp_c = guard safe_c nextapp_c

We verified in Sections 4.3 and 4.4 that the streaming condition holds for
snextapp_c and ⊛_b. Theorem 4 then tells us that we can convert from base b
to base c using

    radixConvert (b, c) = fstream snextapp_c (⊛_b) (unfoldr nextapp_c) (0, 1)

This program works for finite or infinite inputs, and is always productive. (It 
does, however, always produce an infinite result, even when a finite result would 
be correct. For example, it will correctly convert from base 10 to base 2, but 
in converting from base 10 to base 3 it will produce an infinite tail of zeroes. 
One cannot really hope to do better, as returning a finite output depending on 
the entire infinite input is uncomputable.) 
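
Putting the pieces of Section 4 together gives a runnable program. The following is a sketch under the assumption that both state components are represented as `Rational` (the text notes that the chosen residual state needs rational components); `State` and `consume` are names introduced here, with `consume b` standing for ⊛_b.

```haskell
import Data.List (unfoldr)

type State = (Rational, Rational)   -- (n, r) represents the fraction n / r

fstream :: (c -> Maybe (b, c)) -> (c -> a -> c) -> (c -> [b]) -> c -> [a] -> [b]
fstream f g h c x = case f c of
  Just (b, c') -> b : fstream f g h c' x
  Nothing      -> case x of
    a : x' -> fstream f g h (g c a) x'
    []     -> h c

-- Aggressive producer for output base c.
nextapp :: Integer -> State -> Maybe (Integer, State)
nextapp c (n, r)
  | n == 0    = Nothing
  | otherwise = Just (u, (n - fromInteger u * r / c', r / c'))
  where c' = fromInteger c
        u  = floor (n * c' / r)

-- A state is safe when the lower and upper bounds on the next digit agree.
safe :: Integer -> State -> Bool
safe c (n, r) = floor (n * c' / r) == (floor ((n + 1) * c' / r) :: Integer)
  where c' = fromInteger c

-- Cautious producer: only emit from safe states.
snextapp :: Integer -> State -> Maybe (Integer, State)
snextapp c s = if safe c s then nextapp c s else Nothing

-- Consume one base-b input digit m.
consume :: Integer -> State -> Integer -> State
consume b (n, r) m = (fromInteger m + fromInteger b * n, fromInteger b * r)

radixConvert :: (Integer, Integer) -> [Integer] -> [Integer]
radixConvert (b, c) = fstream (snextapp c) (consume b) (unfoldr (nextapp c)) (0, 1)
```

For example, `radixConvert (10, 2) [2, 5]` emits the bit 0 after reading only the digit 2, and the bit 1 after reading the 5. With the infinite input `cycle [2, 5]`, a prefix of the output can still be taken.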
5 Continued Fractions



Continued fractions are finite or infinite constructions of the form

    b_0 + a_0 / (b_1 + a_1 / (b_2 + a_2 / ···))

in which all the coefficients are integers. They provide an elegant representation
of numbers, both rational and irrational. They have therefore been proposed
by various authors [2,17,24,38,26,27] as a good format in which to carry out
exact real arithmetic. Some of the algorithms for simple arithmetic operations on
continued fractions can be seen as metamorphisms, and as we shall show here,
they can typically be streamed.

We consider algorithms on regular continued fractions: ones in which all the a_i
coefficients are 1, and all the b_i coefficients (except perhaps b_0) are at least 1. We
denote regular continued fractions more concisely in the form ⟨b_0, b_1, b_2, ...⟩. For
example, the continued fraction for π starts ⟨3, 7, 15, 1, 292, ...⟩. Finite continued
fractions correspond to rationals; infinite continued fractions represent irrational
numbers.



5.1 Converting Continued Fractions 

We consider first conversions between rationals and finite regular continued
fractions. To complete the isomorphism between these two sets, we need to augment
the rationals with 1/0 = ∞, corresponding to the empty continued fraction. We
therefore introduce a type ExtRat of rationals extended with ∞.

Conversion from rationals to continued fractions is straightforward. Infinity,
by definition, is represented by the empty fraction. A finite rational a/b has a first
term given by a div b, the integer obtained by rounding the fraction down; this
leaves a remainder of (a mod b)/b, whose reciprocal is the rational from which
to generate the remainder of the continued fraction. Note that as a consequence
of rounding the fraction down to get the first term, the remainder is between
zero and one, and its reciprocal is at least one; therefore the next term (and by
induction, all subsequent terms) will be at least one, yielding a regular continued
fraction as claimed.

    type CF = [Integer]

    toCF :: ExtRat → CF
    toCF = unfoldr get
        where get x = if x == ∞
                      then Nothing
                      else Just (⌊x⌋, 1 / (x − ⌊x⌋))

Converting in the opposite direction is more difficult: of course, not all
continued fractions correspond to rationals. However, finite ones do, and for these
we can compute the rational using a fold: it suffices to fold with the inverse
of get (or at least, what would have been the inverse of get, if foldr had been
defined to take an argument of type Maybe (α, β) → β, dualizing unfoldr).

    fromCF :: CF → ExtRat
    fromCF = foldr put ∞ where put n y = n + 1 / y

Thus, fromCF · toCF is the identity on extended rationals, and toCF · fromCF
is the identity on finite continued fractions. On infinite continued fractions,
fromCF yields no result: put is strict, so the whole list of coefficients is required.
One could compute an infinite sequence of rational approximations to the
irrational value represented by an infinite continued fraction, by converting to a
rational each of the convergents. But this is awkward, because the fold starts at
the right, and successive approximations will have no common subexpressions:
it does not constitute a scan. It would be preferable if we could write fromCF
as an instance of foldl; then the sequence of approximations would be given as
the corresponding scanl.
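
The two conversions can be sketched in Haskell as follows. The text does not define ExtRat concretely, so this example assumes it is modelled as `Maybe Rational` with `Nothing` standing for ∞; the helper `recipE` is likewise an assumption introduced to handle 1/0 and 1/∞.

```haskell
import Data.List (unfoldr)
import Data.Ratio ((%))

-- Assumption: extended rationals as Maybe Rational, Nothing = infinity.
type ExtRat = Maybe Rational
type CF = [Integer]

-- Reciprocal on extended rationals: 1/0 = infinity, 1/infinity = 0.
recipE :: ExtRat -> ExtRat
recipE Nothing  = Just 0
recipE (Just 0) = Nothing
recipE (Just x) = Just (recip x)

toCF :: ExtRat -> CF
toCF = unfoldr get
  where
    get Nothing  = Nothing
    get (Just x) = Just (n, recipE (Just (x - fromInteger n)))
      where n = floor x

fromCF :: CF -> ExtRat
fromCF = foldr put Nothing
  where put n y = fmap (fromInteger n +) (recipE y)
```

For instance, `toCF (Just (415 % 93))` yields `[4, 2, 6, 7]`, and `fromCF` maps that list back to `Just (415 % 93)`.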

Fortunately, Theorem 3 comes to the rescue again. This requires
defunctionalizations of functions of the form put n and their compositions. For proper
rationals, we reason:

      put n (put m (a/b))
    = {put}
      put n (m + b/a)
    = {arithmetic}
      put n ((m × a + b)/a)
    = {put}
      n + a/(m × a + b)
    = {arithmetic}
      (n × (m × a + b) + a)/(m × a + b)
    = {collecting terms; dividing through by b}
      ((n × m + 1) × a/b + n)/(m × a/b + 1)

This is a ratio of integer-coefficient linear functions of a/b, sometimes known as
a rational function or linear fractional transformation of a/b. The general form
of such a function takes x to (q x + r)/(s x + t) (denoting multiplication by
juxtaposition for brevity), and can be represented by the four integers q, r, s, t.

For the improper rational ∞, we reason:

      put n (put m ∞)
    = {put}
      put n (m + 1/∞)
    = {1/∞ = 0}
      put n m
    = {put}
      n + 1/m
    = {arithmetic}
      (n × m + 1)/m

which agrees with the result for proper rationals, provided we take the reasonable
interpretation that (q × a/b + r)/(s × a/b + t) = (q × a + r × b)/(s × a + t × b)
when b = 0.

Following Theorem 3, then, we choose four-tuples of integers as our
representation; for reasons that will become clear, we write these four-tuples as
2×2 matrices, shown here as (q, r; s, t) with first row q, r and second row s, t.
The abstraction function abs applies the rational function:

    abs (q, r; s, t) x = (q x + r)/(s x + t)

and the representation function rep injects the integer n into the representation
of put n:

    rep n = (n, 1; 1, 0)

The identity function is represented by ident:

    ident = (1, 0; 0, 1)

We verify that rational functions are indeed closed under composition, by
constructing the representation of function composition:

      abs ((q, r; s, t) ⊙ (q', r'; s', t')) x
    = {requirement}
      abs (q, r; s, t) (abs (q', r'; s', t') x)
    = {abs}
      abs (q, r; s, t) ((q' x + r')/(s' x + t'))
    = {abs again}
      (q (q' x + r') + r (s' x + t'))/(s (q' x + r') + t (s' x + t'))
    = {collecting terms}
      ((q q' + r s') x + (q r' + r t'))/((s q' + t s') x + (s r' + t t'))
    = {abs}
      abs (q q' + r s', q r' + r t'; s q' + t s', s r' + t t') x

We therefore define

    (q, r; s, t) ⊙ (q', r'; s', t') = (q q' + r s', q r' + r t'; s q' + t s', s r' + t t')

Finally, we define an extraction function

    app (q, r; s, t) = abs (q, r; s, t) ∞
                     = q/s

(Notice that ⊙ turns out to be matrix multiplication, and ident the unit matrix,
which explains the choice of notation. These matrices are sometimes called
homographies, and the rational functions they represent homographic functions
or Möbius transformations. They can be generalized from continued fractions to
many other interesting exact representations of real numbers [32], including
redundant ones. In fact, the same framework also encompasses radix conversions,
as explored in Section 3.2.)

By Theorem 3 we then have

    fromCF = app · foldl (⊛) ident
        where (q, r; s, t) ⊛ n = (n q + r, q; n s + t, s)

Of course, this still will not work for infinite continued fractions; however, we
can now define

    fromCFi :: CF → [ExtRat]
    fromCFi = map app · scanl (⊛) ident

yielding the (infinite) sequence of finite convergents of an (infinite) continued
fraction.
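
A small Haskell sketch of the homography-based conversion follows. The datatype `Hom`, the function name `consume` (standing for ⊛) and `convergents` (standing for fromCFi) are naming assumptions for this example, and ∞ is again modelled as `Nothing`.

```haskell
import Data.Ratio ((%))

-- Hom q r s t stands for the matrix with rows (q r) and (s t),
-- that is, the function x -> (q x + r) / (s x + t).
data Hom = Hom Integer Integer Integer Integer deriving (Eq, Show)

ident :: Hom
ident = Hom 1 0 0 1

-- Consume one continued-fraction term n (multiply on the right by rep n).
consume :: Hom -> Integer -> Hom
consume (Hom q r s t) n = Hom (n * q + r) q (n * s + t) s

-- Value at infinity, q/s; Nothing stands for infinity (s == 0).
app :: Hom -> Maybe Rational
app (Hom q _ s _) = if s == 0 then Nothing else Just (q % s)

fromCF :: [Integer] -> Maybe Rational
fromCF = app . foldl consume ident

-- Successive convergents, sharing work between approximations via scanl.
convergents :: [Integer] -> [Maybe Rational]
convergents = map app . scanl consume ident
```

Applied to the opening terms 3, 7, 15 of the continued fraction for π, `convergents` gives ∞ (the empty fraction), 3, 22/7 and 333/106; being a scan, it also works lazily on infinite inputs.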

5.2 Rational Unary Functions of Continued Fractions 

In Section 5.1, we derived the program

    fromCF = app · foldl (⊛) (1, 0; 0, 1)

for converting a finite continued fraction to an extended rational. In fact, we
can compute an arbitrary rational function of a continued fraction, by starting
this process with an arbitrary homography in place of the identity (1, 0; 0, 1). This
is because composition ⊙ fuses with the fold:

      abs h (fromCF ns)
    = {fromCF}
      abs h (app (foldl (⊛) ident ns))
    = {specification of app}
      abs h (abs (foldl (⊛) ident ns) ∞)
    = {requirement on abs and ⊙}
      abs (h ⊙ foldl (⊛) ident ns) ∞
    = {fold fusion: ⊙ is associative, and ident its unit}
      abs (foldl (⊛) h ns) ∞
    = {specification of app again}
      app (foldl (⊛) h ns)

For example, suppose we want to compute the rational 2/(x − 3), where x is
the rational represented by a particular (finite) continued fraction ns. We could
convert ns to the rational x, then perform the appropriate rational arithmetic.
Alternatively, we could convert ns to a rational as above, starting with the
homography (0, 2; 1, −3) instead of (1, 0; 0, 1), and get the answer directly. If we want
the result as a continued fraction again rather than a rational, we simply
post-apply toCF.
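
The example can be worked through in code. This sketch repeats the hypothetical `Hom` representation (with `consume` standing for ⊛ and `Nothing` for ∞) so that it is self-contained; `rationalFn` and `twoOverXMinus3` are names chosen for the illustration.

```haskell
import Data.Ratio ((%))

-- Hom q r s t represents x -> (q x + r) / (s x + t).
data Hom = Hom Integer Integer Integer Integer deriving (Eq, Show)

consume :: Hom -> Integer -> Hom
consume (Hom q r s t) n = Hom (n * q + r) q (n * s + t) s

app :: Hom -> Maybe Rational
app (Hom q _ s _) = if s == 0 then Nothing else Just (q % s)

-- Apply the rational function h to the value of a finite continued fraction.
rationalFn :: Hom -> [Integer] -> Maybe Rational
rationalFn h = app . foldl consume h

-- The homography for x -> 2 / (x - 3).
twoOverXMinus3 :: Hom
twoOverXMinus3 = Hom 0 2 1 (-3)
```

With ns = [3, 7], which represents x = 22/7, the state evolves from `Hom 0 2 1 (-3)` to `Hom 2 0 0 1` and then `Hom 14 2 1 0`, so the result is 14, in agreement with 2/(22/7 − 3).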

Of course, this will not work to compute rational functions of infinite
continued fractions, as the folding will never yield a result. Fortunately, it is possible
to apply streaming, so that terms of the output are produced before the
whole input is consumed. This is the focus of the remainder of this section. The
derivation follows essentially the same steps as were involved in radix conversion.

The streaming process maintains a state in the form of a homography, which
represents the mapping from what is yet to be consumed to what is yet to be
produced. The production steps of the streaming process choose a term to
output, and compute a reduced homography for the remainder of the computation.
Given a current homography (q, r; s, t), and a chosen term n, the reduced
homography (q', r'; s', t') is determined as follows:

      (q x + r)/(s x + t) = n + 1/((q' x + r')/(s' x + t'))
    ≡ {reciprocal}
      (q x + r)/(s x + t) = n + (s' x + t')/(q' x + r')
    ≡ {rearrange}
      (s' x + t')/(q' x + r') = (q x + r)/(s x + t) − n
    ≡ {incorporate n into fraction}
      (s' x + t')/(q' x + r') = (q x + r − n (s x + t))/(s x + t)
    ≡ {collect x and non-x terms}
      (s' x + t')/(q' x + r') = ((q − n s) x + r − n t)/(s x + t)
    ⇐ {equating terms}
      q' = s, r' = t, s' = q − n s, t' = r − n t

That is,

    (q', r'; s', t') = (0, 1; 1, −n) ⊙ (q, r; s, t)

We therefore define

    emit (q, r; s, t) n = (0, 1; 1, −n) ⊙ (q, r; s, t)
                        = (s, t; q − n s, r − n t)

Making It a Metamorphism. In most of what follows, we assume that we 
have a completely regular continued fraction, namely one in which every coeffi- 
cient including the first is at least one. This implies that the value represented 
by the continued fraction is between one and infinity. We see at the end of the 
section what to do about the first coefficient, in case it is less than one. 

Given the representation of a rational function in the form of a homography
h, we introduce the function rfc (‘rational function of a completely regular
continued fraction’) to apply it as follows:

    rfc h = toCF · app · foldl (⊛) h

This is almost a metamorphism: toCF is indeed an unfold, but we must get rid of
the projection function app in the middle. Fortunately, it fuses with the unfold:

    unfoldr get · app = unfoldr geth

where geth (for ‘get on homographies’) is defined by

    geth (q, r; s, t) = if s == 0 then Nothing else Just (n, emit (q, r; s, t) n)
        where n = q div s

as can easily be verified.

This yields a metamorphism:

    rfc h = unfoldr geth · foldl (⊛) h

Checking the Streaming Condition. Now we must check that the streaming
condition holds for geth and ⊛. We require that when

    geth h = Just (n, h')

then, for any subsequent term m (which we can assume to be at least 1, this
being a completely regular continued fraction),

    geth (h ⊛ m) = Just (n, h' ⊛ m)

Unpacking this, when h = (q, r; s, t) and h' = (q', r'; s', t'), we have s ≠ 0,
n = q div s, q' = s, and s' = q mod s; moreover, (q, r; s, t) ⊛ m =
(m q + r, q; m s + t, s). We require among other things that m s + t ≠ 0 and
(m q + r) div (m s + t) = q div s. Sadly, this does not hold; for example, if
m = 1 and s, t are positive,

    (q + r)/(s + t) < 1 + q/s  ≡  s (q + r) < (q + s) (s + t)  ≡  r s < q t + s t + s²

which fails if r is sufficiently large.

Cautious Progress. As with the radix conversion algorithm in Section 4.3, 
the function that produces the next term of the output must be more cautious 
when it is interleaved with consumption steps than it may be after all the input 
has been consumed. The above discussion suggests that we should commit to an 
output only when it is safe from being invalidated by a later input; in symbols, 
only when (m q + r) div (m s + t) = q div s for any m ≥ 1. This follows if s and 
t are non-zero and have the same sign, and if (q + r) div (s + t) = q div s, as a 
little calculation will verify. 

(Another way of looking at this is to observe that the value represented by 
a completely regular continued fraction ranges between 1 and ∞, so the result 
of transforming it under a homography $\begin{pmatrix} q & r \\ s & t \end{pmatrix}$ ranges between 

$$\mathit{abs} \begin{pmatrix} q & r \\ s & t \end{pmatrix} 1 = \frac{q + r}{s + t}$$

and 

$$\mathit{abs} \begin{pmatrix} q & r \\ s & t \end{pmatrix} \infty = \frac{q}{s}$$

if s ≠ 0. If the two denominators have the same sign, the result ranges between 
these two; if they have different signs, it ranges outside them. Therefore, the 
first coefficient of the output is determined if the denominators have the same 
sign (which follows if s and t are non-zero and of the same sign) and the two 
fractions have the same integer parts.) 

We therefore define gets (for ‘safe get’) by 

gets $\begin{pmatrix} q & r \\ s & t \end{pmatrix}$ = let n = q div s in 
    if s × t > 0 ∧ (q + r) div (s + t) == n 
    then Just (n, emit $\begin{pmatrix} q & r \\ s & t \end{pmatrix}$ n) 
    else Nothing 

Note that whenever gets produces a value, geth produces the same value; but 
sometimes gets produces nothing when geth produces something. The streaming 
condition does hold for gets and ⊗, as the reader may now verify. 
Flushing Streams. It is not the case that unfoldr get · app = unfoldr gets, of 
course, because the latter is too cautious. However, it does follow that 

unfoldr get · app = apol gets (unfoldr geth) 

This cautiously produces elements while it is safe to do so, then throws caution 
to the winds and produces elements anyway when it ceases to be safe. Moreover, 
Theorem 4 applies to the cautious part, and so 

rfc h = unfoldr get · app · foldl (⊗) h 
      = fstream gets (⊗) (unfoldr geth) h 

This streaming algorithm can compute a rational function of a finite or infinite 
completely regular continued fraction, yielding a finite or infinite regular 
continued fraction as a result. 
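
A hedged Haskell sketch of this flushing stream follows. The definition of `fstream` is reconstructed from its use here (produce while the cautious producer succeeds, otherwise consume, and flush when the input runs out), so treat it as an assumption rather than the paper's exact text; the pair-of-pairs representation and the names `consume` and `rfcStream` are likewise illustrative.

```haskell
import Data.List (unfoldr)

type Homography = ((Integer, Integer), (Integer, Integer))

consume :: Homography -> Integer -> Homography
consume ((q, r), (s, t)) m = ((m * q + r, q), (m * s + t, s))

emit :: Homography -> Integer -> Homography
emit ((q, r), (s, t)) n = ((s, t), (q - n * s, r - n * t))

-- Producer used once the input is exhausted.
geth :: Homography -> Maybe (Integer, Homography)
geth h@((q, _), (s, _))
  | s == 0    = Nothing
  | otherwise = Just (n, emit h n)
  where n = q `div` s

-- Cautious producer: commits only when no later input term can change n.
gets :: Homography -> Maybe (Integer, Homography)
gets h@((q, r), (s, t))
  | s * t > 0 && (q + r) `div` (s + t) == n = Just (n, emit h n)
  | otherwise                               = Nothing
  where n = q `div` s

-- Assumed shape of the flushing stream operator.
fstream :: (c -> Maybe (b, c)) -> (c -> a -> c) -> (c -> [b]) -> c -> [a] -> [b]
fstream f g flush c x = case f c of
  Just (b, c') -> b : fstream f g flush c' x
  Nothing      -> case x of
    a : x' -> fstream f g flush (g c a) x'
    []     -> flush c

rfcStream :: Homography -> [Integer] -> [Integer]
rfcStream = fstream gets consume (unfoldr geth)
```

Because output is produced as soon as it is safe, this version also works on infinite input: doubling the continued fraction of the square root of 2, `take 5 (rfcStream ((2, 0), (0, 1)) (1 : repeat 2))`, gives the first five terms 2, 1, 4, 1, 4.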

Handling the First Term. A regular but not completely regular continued 
fraction may have a first term of 1 or less, invalidating the reasoning above. 
However, this is easy to handle, simply by consuming the first term immediately. 
We introduce a wrapper function rf: 

rf h [] = rfc h [] 
rf h (n : x) = rfc (h ⊗ n) x 

This streaming algorithm can compute any rational function of a finite or infinite 
regular continued fraction, completely regular or not. 

5.3 Rational Binary Functions of Continued Fractions 

The streaming process described in Section 5.2 allows us to compute a unary 
rational function (a x + b)/(c x + d) of a single continued fraction x. The technique 
can be adapted to allow a binary rational function (a x y + b x + c y + d)/(e x y + 
f x + g y + h) of continued fractions x and y. This does not fit into our framework 
of metamorphisms and streaming algorithms, because it combines two arguments 
into one result; nevertheless, much of the same reasoning can be applied. We 
intend to elaborate on this in a companion paper. 

6 Two Other Applications 

In this section, we briefly outline two other applications of streaming; we have 
described both in more detail elsewhere, and we refer the reader to those sources 
for the details. 

6.1 Digits of π 

In [14] we present an unbounded spigot algorithm for computing the digits of π. 
This work was inspired by Rabinowitz and Wagon [33], who coined the term 
spigot algorithm for an algorithm that yields output elements incrementally and 
does not reuse them after they have been output, so the digits drip out one by 
one, as if from a leaky tap. (In contrast, most algorithms for computing approximations 
to π, including the best currently known, work inscrutably until they 
deliver a complete response at the end of the computation.) Although incremental, 
Rabinowitz and Wagon’s algorithm is bounded, since one needs to decide at 
the outset how many digits are to be computed, whereas our algorithm yields 
digits indefinitely. (This is nothing to do with evaluation order: Rabinowitz and 
Wagon’s algorithm is just as bounded in a lazy language.) 

The algorithm is based on the following expansion: 

$$\pi = \sum_{i=0}^{\infty} \frac{(i!)^2 \, 2^{i+1}}{(2i+1)!} = 2 + \frac{1}{3}\left(2 + \frac{2}{5}\left(2 + \frac{3}{7}\left(\cdots\left(2 + \frac{i}{2i+1}\left(\cdots\right)\right)\right)\right)\right)$$

A streaming algorithm can convert this infinite sequence of linear fractional 
transformations (represented as homographies) into an infinite sequence of decimal 
digits. The consumption operator is matrix multiplication, written ⊗ in 
Section 5.1. When a digit n is produced, the state h should be transformed into 

$$\begin{pmatrix} 10 & -10n \\ 0 & 1 \end{pmatrix} \otimes h$$

Any tail of the input sequence represents a value between 3 and 4, so homography 
h determines the next digit when 

⌊abs h 3⌋ = ⌊abs h 4⌋ 

(in which case, the digit is the common value of these two expressions). This 
reasoning gives us the following program: 

pi = stream prod (⊗) ident lfts 
  where 
    lfts = [ $\begin{pmatrix} k & 4k+2 \\ 0 & 2k+1 \end{pmatrix}$ | k ← [1..] ] 
    prod h = if ⌊abs h 4⌋ == n then Just (n, $\begin{pmatrix} 10 & -10n \\ 0 & 1 \end{pmatrix}$ ⊗ h) else Nothing 
      where n = ⌊abs h 3⌋ 
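
As a hedged illustration, the program can be transcribed into runnable Haskell along the following lines. The 4-tuple representation of homographies and the helper names `comp` and `floorAt` are assumptions made for this sketch, and `stream` is written out from its informal description (produce when possible, otherwise consume).

```haskell
-- (q, r, s, t) stands for the function x -> (q x + r) / (s x + t).
type LFT = (Integer, Integer, Integer, Integer)

-- Matrix multiplication, playing the role of the consumption operator.
comp :: LFT -> LFT -> LFT
comp (q, r, s, t) (u, v, w, x) =
  (q * u + r * w, q * v + r * x, s * u + t * w, s * v + t * x)

-- Floor of the homography applied to an integer argument
-- (denominators stay positive in this application).
floorAt :: LFT -> Integer -> Integer
floorAt (q, r, s, t) x = (q * x + r) `div` (s * x + t)

stream :: (c -> Maybe (b, c)) -> (c -> a -> c) -> c -> [a] -> [b]
stream f g c x = case f c of
  Just (b, c') -> b : stream f g c' x
  Nothing      -> case x of
    a : x' -> stream f g (g c a) x'
    []     -> []

piDigits :: [Integer]
piDigits = stream prod comp (1, 0, 0, 1) lfts
  where
    lfts = [(k, 4 * k + 2, 0, 2 * k + 1) | k <- [1 ..]]
    prod h
      | floorAt h 4 == n = Just (n, comp (10, -10 * n, 0, 1) h)
      | otherwise        = Nothing
      where n = floorAt h 3
```

Evaluating `take 10 piDigits` yields the digits 3, 1, 4, 1, 5, 9, 2, 6, 5, 3. The state grows without bound as more terms are consumed, which is why arbitrary-precision `Integer` is used.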



6.2 Arithmetic Coding 

Arithmetic coding [40] is a method for data compression. It can be more effective 
than rival schemes such as Huffman coding, while still being as efficient. More- 
over, it is well suited to adaptive encoding, in which the coding scheme evolves 
to match the text being encoded. 

The basic idea of arithmetic encoding is simple. The message to be encoded 
is broken into symbols, such as characters, and each symbol of the message 
is associated with a semi-open subinterval of the unit interval [0 .. 1). Encoding 
starts with the unit interval, and narrows it according to the intervals associated 
with each symbol of the message in turn. The encoded message is the binary 
representation of some fraction chosen from the final ‘target’ interval. 

In [4] we present a detailed derivation of arithmetic encoding and decoding. 
We merely outline the development of encoding here, to show where streaming 
fits in. Decoding follows a similar process to encoding, starting with the unit 
interval and homing in on the binary fraction, reconstructing the plaintext in 
the process; but we will not discuss it here. 

The encoding process can be captured as follows. The type Interval represents 
intervals of the real line, usually subunits (subintervals of the unit interval): 

unit :: Interval 
unit = [0 .. 1) 

Narrowing an interval lr by a subunit pq yields a subinterval of lr, which stands 
in the same relation to lr as pq does to unit. 

narrow :: Interval -> Interval -> Interval 
narrow [l .. r) [p .. q) = [l + (r - l) × p .. l + (r - l) × q) 

We consider only non-adaptive encoding here for simplicity: adaptivity turns 
out to be orthogonal to streaming. We therefore represent each symbol by a fixed 
interval. 

Encoding is a two-stage process: narrowing intervals to a target interval, and 
generating the binary representation of a fraction within that interval (missing 
its final 1). 

encode :: [Interval] -> [Bool] 
encode = unfoldr nextBit · foldl narrow unit 
  where 
    nextBit (l, r) 
      | r ≤ 1/2     = Just (False, narrow (0, 2) (l, r)) 
      | 1/2 ≤ l     = Just (True, narrow (-1, 1) (l, r)) 
      | l < 1/2 < r = Nothing 

This is a metamorphism. 

As described, this is not a very efficient encoding method: the entire message 
has to be digested into a target interval before any of the fraction can be gen- 
erated. However, the streaming condition holds, and bits of the fraction can be 
produced before all of the message is consumed: 

encode :: [Interval] -> [Bool] 
encode = stream nextBit narrow unit 
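
The two versions of the encoder can be compared with a small sketch. It assumes, purely for illustration, that a semi-open interval [l .. r) is represented as a pair of exact rationals; the names `encodeMeta` and `encodeStream` are introduced here to tell the two definitions apart.

```haskell
import Data.List (unfoldr)
import Data.Ratio ((%))

-- (l, r) stands for the semi-open interval [l .. r).
type Interval = (Rational, Rational)

unit :: Interval
unit = (0, 1)

narrow :: Interval -> Interval -> Interval
narrow (l, r) (p, q) = (l + (r - l) * p, l + (r - l) * q)

nextBit :: Interval -> Maybe (Bool, Interval)
nextBit (l, r)
  | r <= 1 % 2 = Just (False, narrow (0, 2) (l, r))   -- lower half: emit 0, rescale
  | 1 % 2 <= l = Just (True, narrow (-1, 1) (l, r))   -- upper half: emit 1, rescale
  | otherwise  = Nothing                              -- straddles 1/2: no bit yet

stream :: (c -> Maybe (b, c)) -> (c -> a -> c) -> c -> [a] -> [b]
stream f g c x = case f c of
  Just (b, c') -> b : stream f g c' x
  Nothing      -> case x of
    a : x' -> stream f g (g c a) x'
    []     -> []

-- Metamorphism: digest the whole message, then generate bits.
encodeMeta :: [Interval] -> [Bool]
encodeMeta = unfoldr nextBit . foldl narrow unit

-- Streaming version: bits appear as soon as they are determined.
encodeStream :: [Interval] -> [Bool]
encodeStream = stream nextBit narrow unit
```

With a hypothetical two-symbol message `[(0, 1 % 2), (1 % 2, 1)]`, both encoders give `[False, True]`: the target interval is [1/4 .. 1/2), and the bits 01 followed by the missing final 1 denote 3/8, which lies inside it.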

7 Future and Related Work 



The notion of metamorphisms in general and of streaming algorithms in partic- 
ular arose out of our work on arithmetic coding [4]. Since then, we have seen 
the same principles cropping up in other areas, most notably in the context of 
various kinds of numeric representations: the radix conversion problem from Section 
3.2, continued fractions as described in Section 5, and computations with 
infinite compositions of homographies as used in Section 6.1. Indeed, one might 
even see arithmetic coding as a kind of numeric representation problem. 

7.1 Generic Streaming 

Our theory of metamorphisms could easily be generalized to other datatypes: 
there is nothing to prevent consideration of folds consuming and unfolds produc- 
ing datatypes other than lists. However, we do not currently have any convincing 
examples. 

Perhaps related to the lack of convincing examples for other datatypes, it is 
not clear what a generic theory of streaming algorithms would look like. List- 
oriented streaming relies essentially on foldl, which does not generalize in any 
straightforward way to other datatypes. (We have in the past attempted to 
show how to generalize scanl to arbitrary datatypes [9-11], and Pardo [31] has 
improved on these attempts; but we do not see yet how to apply those construc- 
tions here.) 

However, the unfold side of streaming algorithms does generalize easily, to 
certain kinds of datatype if not obviously all of them. Consider producing a data 
structure of the type 

data Generic r a = Gen (Maybe (a, r ( Generic r a))) 

for some instance r of the type class Functor. (Lists essentially match this pat- 
tern, with r the identity functor. The type Tree of internally-labelled binary 
trees introduced in Section 2 matches too, with r being the pairing functor. 
In general, datatypes of this form have an empty structure, and all non-empty 
structures consist of a root element and an r-shaped collection of children.) It 
is straightforward to generalize the streaming condition to such types: 

f c = Just (b, c') ⇒ f (g c a) = Just (b, fmap (λu → g u a) c') 

(This has been called an ‘r-invariant’ or ‘mongruence’ [23] elsewhere.) Still, we 
do not have any useful applications of an unfold to a Generic type after a foldl. 

7.2 Related Work: Back to Basics 

Some of the ideas presented here appeared much earlier in work of Hutton and 
Meijer [21]. They studied representation changers, consisting of a function followed 
by the converse of a function. Their representation changers are analogous 
to our metamorphisms, with the function corresponding to the fold and the con- 
verse of a function to the unfold: in a relational setting, an unfold is just the 
converse of a fold, and so our metamorphisms could be seen as a special case of 
representation changers in which both functions are folds. We feel that restrict- 
ing attention to the special case of folds and unfolds is worthwhile, because we 
can capitalize on their universal properties; without this restriction, one has to 
resort to reasoning from first principles. 

Hutton and Meijer illustrate with two examples: carry-save incrementing and 
radix conversion. The carry-save representation of numbers is redundant, using 
the redundancy to avoid rippling carries. Although incrementing such a number 
can be seen as a change of representation, it is a rather special one, as the point 
of the exercise is to copy as much of the input as possible straight to the output; 
it isn’t immediately clear how to fit that constraint into our pattern of folding 
to an abstract value and independently unfolding to a different representation. 
Their radix conversion is similar to ours, but their resulting algorithm is not 
streaming: all of the input must be read before any of the output is produced. 
Abstract. This paper shows how probabilistic reasoning can be applied to the 
predicative style of programming. 



0 Introduction 

Probabilistic programming refers to programming in which the probabilities of the values of 
variables are of interest. For example, if we know the probability distribution from which 
the inputs are drawn, we may calculate the probability distribution of the outputs. We may 
introduce a programming notation whose result is known only probabilistically. A 
formalism for probabilistic programming was introduced by Kozen [3], and further developed 
by Morgan, McIver, Seidel and Sanders [4]. Their work is based on the predicate transformer 
semantics of programs; it generalizes the idea of predicate transformer from a function that 
produces a boolean result to a function that produces a probability result. The work of 
Morgan et al. is particularly concerned with the interaction between probabilistic choice and 
nondeterministic choice, which is required for refinement. 

The term “predicative programming” [0,2] describes programming according to a first- 
order semantics, or relational semantics. The purpose of this paper is to show how 
probabilistic reasoning can be applied to the predicative style of programming. 



1 Predicative Programming 

Predicative programming is a way of writing programs so that each programming step is 
proven as it is made. We first decide what quantities are of interest, and introduce a variable 
for each such quantity. A specification is then a boolean expression whose variables 
represent the quantities of interest. The term "boolean expression" means an expression of 
type boolean, and is not meant to restrict the types of variables and subexpressions, nor the 
operators, within a specification. Quantifiers, functions, terms from the application domain, 
and terms invented for one particular specification are all welcome. 

In a specification, some variables may represent inputs, and some may represent 
outputs. A specification is implemented on a computer when, for any values of the input 
variables, the computer generates (computes) values of the output variables to satisfy the 
specification. In other words, we have an implementation when the specification is true of 
every computation. (Note that we are specifying computations, not programs.) A 
specification S is implementable if 
     ∀σ· ∃σ'· S 

where σ = x, y, ... are the inputs and σ' = x', y', ... are the outputs. In addition, 
specification S is deterministic if, for each input, the satisfactory output is unique. A 
program is a specification that has been implemented, so that a computer can execute it. 

Suppose we are given specification S . If S is a program, a computer can execute it. 
If not, we have some programming to do. That means building a program P such that 
S⇐P is a theorem; this is called refinement. Since S is implied by P, all computer 
behavior satisfying P also satisfies S. We might refine in steps, finding specifications 
R, Q, ... such that S⇐R⇐Q⇐...⇐P. 

2 Notation 

Here are all the notations used in this paper, arranged by precedence level. 



 0.   T ⊥ 0 1 2 ∞ x y ( )     booleans, numbers, variables, bracketed expressions 
 1.   f x                      function application 
 2.   xʸ  x→y                  exponentiation, function space 
 3.   × /                      multiplication, division 
 4.   + - ⊕                    addition, subtraction, modular addition 
 5.   ,..                      from (including) to (excluding) 
 6.   = ≠ < > ≤ ≥ :            comparisons, inclusion 
 7.   ¬                        negation 
 8.   ∧                        conjunction 
 9.   ∨                        disjunction 
10.   ⇒ ⇐                      implications 
11.   := if then else          assignment, conditional composition 
12.   ∀· ∃· Σ· ;               quantifiers, sequential composition 
13.   = ⟹ ⟸ ≥                  equality, implications, comparison 


Exponentiation serves to bracket all operations within the exponent. The infix operators 
/ - associate from left to right. The infix operators × + ⊕ ∧ ∨ ; are associative (they 
associate in both directions). On levels 6, 10, and 13 the operators are continuing; for 
example, a=b=c neither associates to the left nor associates to the right, but means 
a=b ∧ b=c. On any one of these levels, a mixture of continuing operators can be used. 
For example, a≤b<c means a≤b ∧ b<c. The operators on level 13 are identical 
to = ⇒ ⇐ ≥ except for precedence. 

We use unprimed and primed identifiers (for example, x and x') for the initial and 
final values of a variable. We use ok to specify that all variables are unchanged. 
     ok  =  x'=x ∧ y'=y ∧ ... 

The assignment notation x:= e specifies that x is assigned the value e and that all other 
variables are unchanged. 

     x:= e  =  x'=e ∧ y'=y ∧ ... 

Conditional composition is defined as follows: 

     if b then P else Q  =  (b ⇒ P) ∧ (¬b ⇒ Q) 
                         =  b ∧ P ∨ ¬b ∧ Q 
Sequential composition is defined as follows: 

     P;Q  =  ∃σ''· (substitute σ'' for σ' in P) ∧ (substitute σ'' for σ in Q) 
where σ = x, y, ... are the initial values, σ'' = x'', y'', ... are the intermediate values, and 
σ' = x', y', ... are the final values of the variables. There are many laws that can be proven 
from these definitions; one of the most useful is the Substitution Law: 
     x:= e; P  =  (for x substitute e in P) 

where P is a specification not employing the assignment or sequential composition 
operators. To account for execution time, we use a time variable; we use t for the time at 
which execution starts, and t' for the time at which execution ends. In the case of 
nontermination, t'=∞. 
3 Example of Predicative Programming 

As an example of predicative programming, we write a program that cubes using only 
addition, subtraction, and test for zero. Let x and y be natural (non-negative integer 
valued) variables, and let n be a natural constant. Then x'=n³ specifies that the final value 
of variable x is n³. One way to refine (or implement) this specification is as follows: 
     x'=n³ ⟸ x:= n; x'=x×n; x'=x×n 

An initial assignment of n to x followed by two multiplications implies x'=n³. Now 
we need to refine x'=x×n. 

     x'=x×n ⟸ y:= x; x:= 0; x' = x + y×n 

This one is proven by two applications of the Substitution Law: in the specification at the 
right end x' = x + y×n, first replace x by 0 and then replace y by x; after 
simplification, the right side is now identical to the left side, and so the implication is 
proven. Now we have to refine x' = x + y×n. 

     x' = x + y×n ⟸ if y=0 then ok else (x:= x+n; y:= y-1; x' = x + y×n) 

To prove it, let's start with the right side. 

     if y=0 then ok else (x:= x+n; y:= y-1; x' = x + y×n)     Substitution Law twice 
  =  if y=0 then ok else x' = x+n + (y-1)×n                    now simplify 
  =  if y=0 then ok else x' = x + y×n                          expand ok and rewrite if 
  =  y=0 ∧ x'=x ∧ y'=y ∨ y≠0 ∧ x'=x+y×n                       In the left disjunct, y=0 allows us to 
                                                               add 0 in the form of y×n to x. We drop y'=y. 
  ⟹  y=0 ∧ x'=x+y×n ∨ y≠0 ∧ x'=x+y×n                         boolean algebra 
  =  x'=x+y×n 

This latest refinement has not raised any new, unrefined specifications, so we now have a 
complete program. Using identifiers P , Q , and R for the three specifications that are not 
programming notations, we have 
     P ⟸ x:= n; Q; Q 
     Q ⟸ y:= x; x:= 0; R 
     R ⟸ if y=0 then ok else (x:= x+n; y:= y-1; R) 

and we can compile it to C as follows: 

     void P (void) {x = n; Q( ); Q( );} 
     void Q (void) {y = x; x = 0; R( );} 
     void R (void) {if (y==0) ; else {x = x+n; y = y-1; R( );} } 

or, to avoid the poor implementation of recursive call supplied by most compilers, 

     void P (void) {x = n; Q( ); Q( );} 
     void Q (void) {y = x; x = 0; R: if (y==0) ; else {x = x+n; y = y-1; goto R; } } 

To account for time, we add a time variable t . We can account for real time if we 
know the computing platform well enough, but let's just count iterations. We augment the 
specifications to talk about time, and we increase the time variable each iteration. 
     x'=n³ ∧ t'=t+n²+n ⟸ x:= n; x'=x×n ∧ t'=t+x; x'=x×n ∧ t'=t+x 

     x'=x×n ∧ t'=t+x ⟸ y:= x; x:= 0; x' = x + y×n ∧ t'=t+y 

     x' = x + y×n ∧ t'=t+y ⟸ 
         if y=0 then ok else (x:= x+n; y:= y-1; t:= t+1; x' = x + y×n ∧ t'=t+y) 
We leave these proofs for the interested reader. 
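
Short of a proof, the timed refinements can at least be checked by execution on sample inputs. The sketch below is one way to do that, assuming the state is modelled as a triple (x, y, t) and the three specifications P, Q and R are run as state transformers; the names `runP`, `runQ`, `runR` and `cubeOk` are made up for this illustration.

```haskell
-- Program state: (x, y, t), where t counts loop iterations.
type State = (Integer, Integer, Integer)

-- R: if y=0 then ok else (x:= x+n; y:= y-1; t:= t+1; R)
runR :: Integer -> State -> State
runR n (x, y, t)
  | y == 0    = (x, y, t)
  | otherwise = runR n (x + n, y - 1, t + 1)

-- Q: y:= x; x:= 0; R
runQ :: Integer -> State -> State
runQ n (x, _, t) = runR n (0, x, t)

-- P: x:= n; Q; Q
runP :: Integer -> State -> State
runP n (_, y, t) = runQ n (runQ n (n, y, t))

-- Checks x' = n^3 and t' = t + n^2 + n, starting from t = 0.
cubeOk :: Integer -> Bool
cubeOk n = x == n ^ 3 && t == n ^ 2 + n
  where (x, _, t) = runP n (0, 0, 0)
```

Running `all cubeOk [0 .. 30]` confirms both the result and the iteration count for those inputs; it is a test, not a substitute for the proofs left to the reader.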

Here is a linear solution in which n is a natural variable. We can try to find n³ in 
terms of (n-1)³ using the identity n³ = (n-1)³ + 3×n² - 3×n + 1. The problem is the 
occurrence of n², which we can find using the identity n² = (n-1)² + 2×n - 1. So we 
need a variable x for the cubes and a variable y for the squares. We start refining: 
     x'=n³ ⟸ x'=n³ ∧ y'=n² 

     x'=n³ ∧ y'=n² ⟸ if n=0 then (x:= 0; y:= 0) else (n:= n-1; x'=n³ ∧ y'=n²; 
We cannot complete that refinement due to a little problem: in order to get the new values 
of x and y, we need not only the values of x and y just produced by the recursive call, 
but also the original value of n, which was not saved. So we revise: 
     x'=n³ ⟸ x'=n³ ∧ y'=n² ∧ n'=n 

     x'=n³ ∧ y'=n² ∧ n'=n ⟸ 
         if n=0 then (x:= 0; y:= 0) 
         else (n:= n-1; x'=n³ ∧ y'=n² ∧ n'=n; n:= n+1; 
               y:= y + n + n - 1; x:= x + y + y + y - n - n - n + 1) 

After we decrease n, the recursive call promises to leave it alone, and then we increase it 
back to its original value (which fulfills the promise). With time, 
     x'=n³ ∧ t'=t+n ⟸ x'=n³ ∧ y'=n² ∧ n'=n ∧ t'=t+n 
     x'=n³ ∧ y'=n² ∧ n'=n ∧ t'=t+n ⟸ 
         if n=0 then (x:= 0; y:= 0) 
         else (n:= n-1; t:= t+1; x'=n³ ∧ y'=n² ∧ n'=n ∧ t'=t+n; n:= n+1; 
               y:= y + n + n - 1; x:= x + y + y + y - n - n - n + 1) 

Compiling it to C produces 
void P (void) 

{if (n==0) {x = 0; y = 0;} 

else {n = n-1; P( ); n = n+1; y = y+n+n-1; x = x+y+y+y-n-n-n+1;} } 

Here is a linear solution without general recursion. Let z be a natural variable. Let 
     Q  =  y = 3×x^(2/3) + 3×x^(1/3) + 1 ∧ z = 6×x^(1/3) + 6 ⇒ x' = (x^(1/3)+n)³ 
or, more convenient for proof, 

     Q  =  ∀k: nat· x=k³ ∧ y = 3×k² + 3×k + 1 ∧ z = 6×k + 6 ⇒ x' = (k+n)³ 

Then 

     x'=n³ ∧ t'=t+n ⟸ x:= 0; y:= 1; z:= 6; Q ∧ t'=t+n 
     Q ∧ t'=t+n ⟸ 
         if n=0 then ok 
         else (x:= x+y; y:= y+z; z:= z+6; n:= n-1; t:= t+1; Q ∧ t'=t+n) 

The proofs, which are just substitutions and simplifications, are left to the reader. 
Compiling to C produces 

     x = 0; y = 1; z = 6; 
     Q: if (n==0) ; else {x = x+y; y = y+z; z = z+6; n = n-1; goto Q;} 
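
The loop can be sketched as a tail-recursive Haskell function to make the invariant visible: after k iterations x = k³, y = 3k² + 3k + 1 and z = 6k + 6, so adding y to x moves from k³ to (k+1)³. The function name is invented for this illustration.

```haskell
-- Cubes n using only addition, by maintaining first and second differences.
cubeByDifferences :: Integer -> Integer
cubeByDifferences n0 = go n0 0 1 6
  where
    -- Invariant after k steps: x = k^3, y = 3k^2 + 3k + 1, z = 6k + 6.
    go n x y z
      | n == 0    = x
      | otherwise = go (n - 1) (x + y) (y + z) (z + 6)
```

For example, `cubeByDifferences 4` passes through x = 0, 1, 8, 27, 64. Note that the counter n must decrease on every iteration for the loop to terminate.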



4 Exact Precondition 

We say that specification S is refined by specification P if S⇐P is a theorem. That 
means, quantifying explicitly, that 
     ∀σ, σ'· S ⇐ P 

can be simplified to T. For any two specifications S and P, if we quantify over only 
the output variables σ', we obtain the exact precondition, or necessary and sufficient 
precondition (called “weakest precondition” by others) for S to be refined by P. For 
example, in one integer variable x, 

     ∀x'· x'>5 ⇐ (x:= x+1) 
  =  ∀x'· x'>5 ⇐ x'=x+1           One-Point Law 
  =  x+1 > 5 
  =  x > 4 

This means that a computation satisfying x:= x+1 will also satisfy x'>5 if and only if it 
starts with x>4. (If instead we quantify over the input variables σ, we obtain the exact 
(necessary and sufficient) postcondition.) 
Now suppose P is an implementable and deterministic specification, and R' is a 
specification that refers only to output (primed) variables. Then the exact (necessary and 
sufficient) precondition for P to refine R' (“weakest precondition for P to establish 
postcondition R'”) is 

     ∀σ'· P ⇒ R'           by a generalized one-point law 
  =  ∃σ'· P ∧ R' 
  =  P; R 

where R is the same expression as R' except with unprimed variables. For example, the 
exact precondition for execution of x:= x+1 to satisfy x'>5 is 

     x:= x+1; x>5          Substitution Law 
  =  x+1 > 5 
  =  x > 4 



5 Probability 

A specification tells us whether an observation is acceptable or unacceptable. We now 
consider how often the various observations occur. For the sake of simplicity, this paper 
will treat only boolean and integer program variables, although the story is not very different 
for rational and real variables (summations become integrals). 

A distribution is an expression whose value (for all assignments of values to its 
variables) is a probability, and whose sum (over all assignments of values to its variables) is 
1. For example, if n: nat+1 (n is a positive natural), then 2^(-n) is a distribution because 
     (∀n: nat+1· 2^(-n): prob) ∧ (Σn: nat+1· 2^(-n))=1 

where prob is the reals from 0 to 1 inclusive. A distribution is used to tell the 
frequency of occurrence of values of its variables. For example, 2^(-n) says that n has value 
3 one-eighth of the time. If we have two variables n, m: nat+1, then 2^(-n-m) is a 
distribution because 

     (∀n, m: nat+1· 2^(-n-m): prob) ∧ (Σn, m: nat+1· 2^(-n-m))=1 
Distribution 2^(-n-m) says that the state in which n has value 3 and m has value 1 
occurs one-sixteenth of the time. 

If we have a distribution of several variables and we sum over some of them, we get a 
distribution describing the frequency of occurrence of the values of the other variables. If 
n, m: nat+1 are distributed as 2^(-n-m), then Σm: nat+1· 2^(-n-m), which is 2^(-n), tells us the 
frequency of occurrence of values of n. 

If a distribution of several variables can be written as a product of distributions whose 
factors partition the variables, then each of the factors is a distribution describing the 
variables in its part, and the parts are said to be independent. For example, we can write 
2^(-n-m) as 2^(-n) × 2^(-m), so n and m are independent. 

The average value of number expression e as variables v vary over their domains 
according to distribution p is 
     Σv· e×p 

For example, the average value of n² as n varies over nat+1 according to distribution 
2^(-n) is Σn: nat+1· n² × 2^(-n), which is 6. The average value of n-m as n and m vary 
over nat+1 according to distribution 2^(-n-m) is Σn, m: nat+1· (n-m) × 2^(-n-m), which is 0. 
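
The first average can be checked numerically. The sketch below truncates the infinite sum at a finite bound, so it only approximates the stated value; the names `weight` and `partialAverage` are chosen for this illustration.

```haskell
import Data.Ratio ((%))

-- The distribution 2^(-n) on the positive naturals.
weight :: Integer -> Rational
weight n = 1 % (2 ^ n)

-- Partial sum of n^2 * 2^(-n) for n = 1 .. k, computed exactly.
partialAverage :: Integer -> Rational
partialAverage k = sum [fromInteger (n * n) * weight n | n <- [1 .. k]]
```

The partial sums increase towards 6: `partialAverage 2` is exactly 3/2, and by k = 60 the gap to 6 is far below 10⁻¹², which is consistent with the average stated in the text.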
6 Probabilistic Specifications 

To facilitate the combination of specifications and probabilities, add axioms 
     T = 1 
     ⊥ = 0 

equating booleans with numbers. 

Let S be an implementable deterministic specification. Let p be the distribution 
describing the initial state σ. Then the distribution describing the final state σ' is 
     Σσ· S × p 

which is a generalization of the formula for average. Here is an example in two integer 
variables x and y . Suppose x starts with value 7 one-third of the time, and starts with 
value 8 two-thirds of the time. Then the distribution of x is 
     (x=7) × 1/3 + (x=8) × 2/3 
The probability that x has value 7 is therefore 
     (7=7) × 1/3 + (7=8) × 2/3 
  =  T × 1/3 + ⊥ × 2/3 
  =  1 × 1/3 + 0 × 2/3 
  =  1/3 

Similarly, the probability that x has value 8 is 2/3 , and the probability that x has 
value 9 is 0 . Let X be the preceding distribution of x . Suppose that y also starts 
with value 7 one-third of the time, and starts with value 8 two-thirds of the time, 
independently of x . Then its distribution Y is given by 
     Y = (y=7) / 3 + (y=8) × 2/3 
and the distribution of initial states is X×Y. Let S be 

if x=y then (x:= 0; y:= 0) else (x:= abs(x-y); y:= 1) 

Then the distribution of final states is 
     Σx, y· S × X × Y 
  =  Σx, y· (x=y ∧ x'=y'=0 ∨ x≠y ∧ x'=abs(x-y) ∧ y'=1) 
            × ((x=7) / 3 + (x=8) × 2/3) 
            × ((y=7) / 3 + (y=8) × 2/3) 
  =  (x'=y'=0) × 5/9 + (x'=y'=1) × 4/9 

We should see x'=y'=0 five-ninths of the time, and x'=y'=1 four-ninths of the time. 
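
The sum over initial states can be carried out mechanically. The following sketch assumes a distribution is represented as a list of (state, probability) pairs and that S is run as a function on states; the names `initial`, `step`, `final` and `probOf` are illustrative only.

```haskell
import Data.Ratio ((%))

type Dist a = [(a, Rational)]

-- Distribution X × Y of initial states (x, y).
initial :: Dist (Integer, Integer)
initial = [((x, y), px * py) | (x, px) <- one, (y, py) <- one]
  where one = [(7, 1 % 3), (8, 2 % 3)]

-- S: if x=y then (x:= 0; y:= 0) else (x:= abs(x-y); y:= 1)
step :: (Integer, Integer) -> (Integer, Integer)
step (x, y)
  | x == y    = (0, 0)
  | otherwise = (abs (x - y), 1)

-- Distribution of final states: push each initial state through S.
final :: Dist (Integer, Integer)
final = [(step s, p) | (s, p) <- initial]

-- Total probability of one outcome.
probOf :: Eq a => a -> Dist a -> Rational
probOf v d = sum [p | (w, p) <- d, w == v]
```

Here `probOf (0, 0) final` is 5/9 and `probOf (1, 1) final` is 4/9, matching the calculation above.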

A probability distribution such as (x'=y'=0) x 5/9 + (x'=y'=l) x 4/9 describes what 
we expect to see. It can equally well be used as a probabilistic specification of what we want 
to see. A boolean specification is just a special case of probabilistic specification. We now 
generalize conditional composition and sequential composition to apply to probabilistic 
specifications as follows. If b is a probability, and P and Q are distributions of final 
states, then 

     if b then P else Q  =  b×P + (1-b)×Q 

     P;Q  =  Σσ''· (substitute σ'' for σ' in P) × (substitute σ'' for σ in Q) 
are distributions of final states. For example, in one integer variable x , suppose we start 
by assigning 0 with probability 1/3 or 1 with probability 2/3 ; that's 
if 1/3 then x:= 0 else x:= 1 

Subsequently, if x=0 then we add 2 with probability 1/2 or 3 with probability 1/2 , 
otherwise we add 4 with probability 1/4 or 5 with probability 3/4 ; that's 
if x=0 then if 1/2 then x:= x+2 else x:= x+3 
else if 1/4 then x:= x+4 else x:= x+5 

Notice that the programmer's if gives us conditional probability. Our calculation 
     if 1/3 then x:= 0 else x:= 1; 
     if x=0 then if 1/2 then x:= x+2 else x:= x+3 
            else if 1/4 then x:= x+4 else x:= x+5 

  =  Σx''· ((x''=0)/3 + (x''=1)×2/3) 
           × (  (x''=0) × ((x'=x''+2)/2 + (x'=x''+3)/2) 
              + (x''≠0) × ((x'=x''+4)/4 + (x'=x''+5)×3/4)) 

  =  (x'=2)/6 + (x'=3)/6 + (x'=5)/6 + (x'=6)/2 

says that the result is 2 with probability 1/6, 3 with probability 1/6, 5 with 
probability 1/6, and 6 with probability 1/2. 
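
Probabilistic conditional and sequential composition can be modelled with list-based distributions. This is only a sketch under that modelling assumption; the combinator names `certainly`, `choose` and `andThen` are invented here and are not the paper's notation.

```haskell
import Data.Ratio ((%))

type Dist a = [(a, Rational)]

-- An assignment: one outcome with probability 1.
certainly :: a -> Dist a
certainly v = [(v, 1)]

-- if b then P else Q  =  b×P + (1-b)×Q, for a probability b.
choose :: Rational -> Dist a -> Dist a -> Dist a
choose b p q = [(v, b * w) | (v, w) <- p] ++ [(v, (1 - b) * w) | (v, w) <- q]

-- P;Q: sum over intermediate states of the product of probabilities.
andThen :: Dist a -> (a -> Dist b) -> Dist b
andThen p k = [(v', w * w') | (v, w) <- p, (v', w') <- k v]

first :: Dist Integer
first = choose (1 % 3) (certainly 0) (certainly 1)

second :: Integer -> Dist Integer
second x
  | x == 0    = choose (1 % 2) (certainly (x + 2)) (certainly (x + 3))
  | otherwise = choose (1 % 4) (certainly (x + 4)) (certainly (x + 5))

result :: Dist Integer
result = first `andThen` second

probOf :: Eq a => a -> Dist a -> Rational
probOf v d = sum [p | (w, p) <- d, w == v]
```

Evaluating `probOf` on `result` for the outcomes 2, 3, 5 and 6 gives 1/6, 1/6, 1/6 and 1/2 respectively, as in the calculation.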

We earlier used the formula Σσ· S × p to calculate the distribution of final states from 
the distribution p of initial states and an operation specified by S. We can now restate 
this formula as (p'; S) where p' is the same as p but with primes on the variables. And 
the formula (S; p) giving the exact precondition for implementable deterministic S to 
refine p' also works when S is a distribution. 

Various distribution laws are provable from probabilistic sequential composition. Let 
n be a number, and let P , Q , and R be probabilistic specifications. Then 
     n×P; Q  =  n×(P; Q)  =  P; n×Q 
     P+Q; R  =  (P; R) + (Q; R) 
     P; Q+R  =  (P; Q) + (P; R) 

Best of all, the Substitution Law still works. (We postpone disjunction to Section 10.) 



7 Random Number Generators 

Many programming languages provide a random number generator (sometimes called a 
“pseudo-random number generator”). The usual notation is functional, and the usual result is 
a value whose distribution is uniform (constant) over a nonempty finite range. If n: nat+1, 
we use the notation rand n for a generator that produces natural numbers uniformly 
distributed over the range 0,..n (from (including) 0 to (excluding) n). So rand n has 
value r with probability (r: 0,..n) / n. (Recall: r: 0,..n is T or 1 if r is one of 0, 1, 
2, ..., n-1, and ⊥ or 0 otherwise.) 

Functional notation for a random number generator is inconsistent. Since x=x is a 
law, we should be able to simplify rand n = rand n to T , but we cannot because the two 
occurrences of rand n might generate different numbers. Since x+x = 2xx is a law, we 
should be able to simplify rand n + rand n to 2 x rand n , but we cannot. To restore 
consistency, we replace each use of rand n with a fresh integer variable r whose value has 
probability (r: 0,..n)/n before we do anything else. Or, if you prefer, we replace each use 
of rand n with a fresh variable r: 0,..n whose value has probability 1/n. (This is a 
mathematical variable, not a state variable; in other words, there is no r'.) For example, 
in one state variable x, 

     x:= rand 2; x:= x + rand 3                              replace the two rands with r and s 
  =  Σr: 0,..2· Σs: 0,..3· (x:= r; x:= x + s) × 1/2 × 1/3    Substitution Law 
  =  Σr: 0,..2· Σs: 0,..3· (x' = r+s) / 6                    sum 
  =  ((x' = 0+0) + (x' = 0+1) + (x' = 0+2) + (x' = 1+0) + (x' = 1+1) + (x' = 1+2)) / 6 
  =  (x'=0) / 6 + (x'=1) / 3 + (x'=2) / 3 + (x'=3) / 6 

which says that x is 0 one-sixth of the time, 1 one-third of the time, 2 one-third of 
the time, and 3 one-sixth of the time. 

Whenever rand occurs in the context of a simple equation, such as r = rand n, we 
don't need to introduce a variable for it, since one is supplied. We just replace the deceptive 
equation with (r: 0,..n) / n. For example, in one variable x, 
     x:= rand 2; x:= x + rand 3                       replace assignments 
  =  (x': 0,..2)/2; (x': x,..x+3)/3                   sequential composition 
  =  Σx''· (x'': 0,..2)/2 × (x': x'',..x''+3)/3       sum 
  =  1/2 × (x': 0,..3)/3 + 1/2 × (x': 1,..4)/3 
  =  (x'=0) / 6 + (x'=1) / 3 + (x'=2) / 3 + (x'=3) / 6 

as before. 

Although rand produces uniformly distributed natural numbers, it can be transformed 
into many different distributions. We just saw that rand 2 + rand 3 has value n with 
distribution (n=0 ∨ n=3) / 6 + (n=1 ∨ n=2) / 3 . As another example, rand 8 < 3 has
boolean value b with distribution

    Σr: 0,..8· (b = (r<3)) / 8
=   (b=⊤) × 3/8 + (b=⊥) × 5/8
=   5/8 − b/4

which says that b is ⊤ three-eighths of the time, and ⊥ five-eighths of the time.
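
As a small check of the closed form 5/8 − b/4, the sketch below (an illustration only, with names chosen here) tabulates the distribution of rand 8 < 3 and compares it with the formula, reading ⊤ as 1 and ⊥ as 0 in the same way the paper treats booleans as numbers.

```python
from fractions import Fraction

# Distribution of the boolean b = (r < 3) for r uniform over 0,..8
b_dist = {True: Fraction(0), False: Fraction(0)}
for r in range(8):
    b_dist[r < 3] += Fraction(1, 8)

# Closed form from the text: 5/8 - b/4, with True read as 1 and False as 0
closed_form = {b: Fraction(5, 8) - Fraction(int(b), 4) for b in (True, False)}

print(b_dist[True], b_dist[False])
```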



8 Blackjack 

This example is a simplified version of the card game known as blackjack. You are dealt a 
card from a deck; its value is in the range 1 through 13 inclusive. You may stop with 
just one card, or have a second card if you want. Your object is to get a total as near as 
possible to 14 , but not over 14 . Your strategy is to take a second card if the first is under 
7 . Assuming each card value has equal probability (actually, the second card drawn has a 
diminished probability of having the same value as the first card drawn, but let’s ignore that 
complication), we represent a card as (rand 13) + 1 . In one variable x , the game is 
    x:= (rand 13) + 1; if x<7 then x:= x + (rand 13) + 1 else ok

First we introduce variables c, d: 0,..13 for the two uses of rand , each with probability
1/13 . The program becomes

    x:= c+1; if x<7 then x:= x+d+1 else ok           Substitution Law

=   if c+1 < 7 then x' = c+d+2 else x' = c+1

Then x' has distribution

    Σc, d: 0,..13· (if c+1 < 7 then x' = c+d+2 else x' = c+1) × 1/13 × 1/13

                                                      by several omitted steps

=   ((2≤x'<7)×(x'−1) + (7≤x'<14)×19 + (14≤x'<20)×(20−x')) / 169
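
The omitted steps can be checked mechanically. The sketch below is an illustration added here, not the author's derivation: it enumerates all 169 pairs (c, d) for the “under 7” strategy and compares the resulting distribution with the closed form. The function names under_strategy and closed_form are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product


def under_strategy(limit):
    """Distribution of the final total when a second card is taken iff the first is under limit."""
    dist = {}
    for c, d in product(range(13), repeat=2):
        first = c + 1
        total = first + d + 1 if first < limit else first
        dist[total] = dist.get(total, Fraction(0)) + Fraction(1, 169)
    return dist


def closed_form(total):
    """Closed form for the 'under 7' strategy, as stated in the text."""
    if 2 <= total < 7:
        weight = total - 1
    elif 7 <= total < 14:
        weight = 19
    elif 14 <= total < 20:
        weight = 20 - total
    else:
        weight = 0
    return Fraction(weight, 169)


under_7 = under_strategy(7)
for total in sorted(under_7):
    print(total, under_7[total], closed_form(total))
```

Calling under_strategy with another limit gives the distribution for “under 8” or any other threshold strategy.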

Alternatively, we can use the variable provided rather than introduce new ones, as follows. 
    x:= (rand 13) + 1; if x<7 then x:= x + (rand 13) + 1 else ok

                                                      replace assignments and ok

=   (x': 1,..14)/13; if x<7 then (x': x+1,..x+14)/13 else x'=x          replace ; and if

=   Σx''· (x'': 1,..14)/13 × ((x''<7)×(x': x''+1,..x''+14)/13 + (x''≥7)×(x'=x''))

                                                      by several omitted steps

=   ((2≤x'<7)×(x'−1) + (7≤x'<14)×19 + (14≤x'<20)×(20−x')) / 169

That is the distribution of x' if we use the “under 7” strategy. We can similarly find
the distribution of x' if we use the “under 8” strategy, or any other strategy. But which
strategy is best? To compare two strategies, we play both of them at once. Player x will
play “under n” and player y will play “under n+1” using exactly the same cards (the
result would be no different if they used different cards, but it would require more variables).
Here is the new game:







    if c+1 < n then x:= c+d+2 else x:= c+1;
    if c+1 < n+1 then y:= c+d+2 else y:= c+1;
    y<x≤14 ∨ x≤14<y                   This line is the condition that x wins. We want to know
                                      the probability that it is true. Factor out x:= and y:= .

=   x:= if c+1 < n then c+d+2 else c+1;
    y:= if c+1 < n+1 then c+d+2 else c+1;
    y<x≤14 ∨ x≤14<y                   Use the substitution law twice.

=   (if c+1<n+1 then c+d+2 else c+1) < (if c+1<n then c+d+2 else c+1) ≤ 14
    ∨ (if c+1<n then c+d+2 else c+1) ≤ 14 < (if c+1<n+1 then c+d+2 else c+1)

=   c = n−1 ∧ d > 13−n

Now the probability that x wins is

    Σc, d: 0,..13· (c = n−1 ∧ d > 13−n) × 1/13 × 1/13
=   (n−1) / 169

By similar calculations we can find that the probability that y wins is (14−n) / 169 , and
the probability of a tie is 12/13 . For n<8 , “under n+1” beats “under n”. For n≥8 ,
“under n” beats “under n+1”. So “under 8” beats both “under 7” and “under 9”.
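
The three probabilities can be confirmed by playing both strategies on every pair of cards. The following sketch is illustrative only; play and compare are names assumed here. It counts wins for player x (“under n”), wins for player y (“under n+1”) and ties, for thresholds n from 1 to 13.

```python
from fractions import Fraction
from itertools import product


def play(first, second, limit):
    """Final total: take the second card only if the first card is under limit."""
    return first + second if first < limit else first


def compare(n):
    """Return (P(x wins), P(y wins), P(tie)) for 'under n' against 'under n+1' on the same cards."""
    x_wins = y_wins = tie = Fraction(0)
    for c, d in product(range(13), repeat=2):
        x = play(c + 1, d + 1, n)
        y = play(c + 1, d + 1, n + 1)
        if y < x <= 14 or x <= 14 < y:
            x_wins += Fraction(1, 169)
        elif x < y <= 14 or y <= 14 < x:
            y_wins += Fraction(1, 169)
        else:
            tie += Fraction(1, 169)
    return x_wins, y_wins, tie


for n in range(1, 14):
    print(n, *compare(n))
```

The two players differ only when the first card is exactly n, which is why the tie probability is 12/13 for every n in this range.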



9 Dice 

If you repeatedly throw a pair of six-sided dice until they are equal, how long does it take? 
The program is 

    R ⇐ u:= (rand 6) + 1; v:= (rand 6) + 1; if u=v then ok else (t:= t+1; R)

for an appropriate definition of R . First, introduce variables r, s: 0,..6 , each having
probability 1/6 , for the two uses of rand , and simplify by eliminating variables u and v .

    R ⇐ if r=s then t'=t else (t:= t+1; R)

But there's a problem. As it stands, we could define R = if r=s then t'=t else t'=∞ ,
which says that the execution time is either 0 or ∞ . The problem is that variable r
stands for a single use of rand , and similarly for s . In the previous example, we had no
loops, so a single appearance of rand was a single use of rand . Now we have a loop, and
r has the same value each iteration, and so has s . The solution to this problem is to
parameterize r and s by iteration or by time. We introduce r, s: time→(0,..6) , with r t
and s t each having probability 1/6 . The program is

    R ⇐ if r t = s t then t'=t else (t:= t+1; R)

Now we can define R to tell us the execution time.

    (∀i: t,..t'· r i ≠ s i) ∧ r t' = s t'

says that t' is the first time (from t onward) that the two dice are equal. The refinement is
proved as follows:

    if r t = s t then t'=t else (t:= t+1; (∀i: t,..t'· r i ≠ s i) ∧ r t' = s t')       case and substitution

=   r t = s t ∧ t'=t  ∨  r t ≠ s t ∧ (∀i: t+1,..t'· r i ≠ s i) ∧ r t' = s t'             axioms of ∀

=   (∀i: t,..t'· r i ≠ s i) ∧ r t' = s t'

Treating r i and s i as though they were simple variables, r i ≠ s i with probability

    Σ r i, s i: 0,..6· (r i ≠ s i) × 1/6 × 1/6   =   5/6

and r i = s i with probability

    Σ r i, s i: 0,..6· (r i = s i) × 1/6 × 1/6   =   1/6

So we offer the hypothesis that the final time t' has the distribution

    (t'≥t) × (5/6)^(t'−t) × 1/6

We can verify this distribution as follows. The distribution of the implementation (right
side) is

    Σ r t, s t: 0,..6· (if r t = s t then t'=t else (t:= t+1; (t'≥t) × (5/6)^(t'−t) × 1/6)) × 1/6 × 1/6      sum

=   (6 × (t'=t) + 30 × (t:= t+1; (t'≥t) × (5/6)^(t'−t) × 1/6)) × 1/6 × 1/6                                    substitution

=   (6 × (t'=t) + 30 × (t'≥t+1) × (5/6)^(t'−t−1) × 1/6) × 1/6 × 1/6                                           arithmetic

=   (t'=t) × 1/6 + (t'≥t+1) × (5/6)^(t'−t) × 1/6

=   (t'≥t) × (5/6)^(t'−t) × 1/6

The last line is the distribution of the specification, which concludes the proof. 

The alternative to introducing new variables r and s is as follows. Starting with the 
implementation, 

    u:= (rand 6) + 1; v:= (rand 6) + 1;
    if u=v then t'=t else (t:= t+1; (t'≥t) × (5/6)^(t'−t) × 1/6)

=   ((u': 1,..7) ∧ v'=v ∧ t'=t)/6; (u'=u ∧ (v': 1,..7) ∧ t'=t)/6;
    if u=v then t'=t else (t'≥t+1) × (5/6)^(t'−t−1) / 6

=   ((u': 1,..7) ∧ (v': 1,..7) ∧ t'=t)/36;
    if u=v then t'=t else (t'≥t+1) × (5/6)^(t'−t−1) / 6

=   Σu'', v'': 1,..7· Σt''· (t''=t)/36 × ((u''=v'') × (t'=t'')
                                          + (u''≠v'') × (t'≥t''+1) × (5/6)^(t'−t''−1) / 6)

=   1/36 × (6 × (t'=t) + 30 × (t'≥t+1) × (5/6)^(t'−t−1) / 6)

=   (t'≥t) × (5/6)^(t'−t) × 1/6

which is the probabilistic specification. 

The average value of t' is

    Σt'· t' × (t'≥t) × (5/6)^(t'−t) × 1/6   =   t+5

so on average it takes 5 additional throws of the dice to get an equal pair.
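
Both the verification step and the average can be checked numerically. In the sketch below (an illustration under assumptions, not part of the paper) k stands for t'−t, spec is the hypothesized distribution, and implementation is one unfolding of the loop: 6 of the 36 equally likely pairs stop at once, and the other 30 continue from time t+1. The horizon of 300 throws is an arbitrary truncation of the infinite sum.

```python
from fractions import Fraction


def spec(k):
    """Hypothesized probability that the first equal pair occurs k throws after time t."""
    return Fraction(5, 6) ** k * Fraction(1, 6) if k >= 0 else Fraction(0)


def implementation(k):
    """One unfolding of the loop body, summed over the 36 outcomes of the two dice."""
    stop_now = Fraction(6, 36) * (1 if k == 0 else 0)
    continue_later = Fraction(30, 36) * spec(k - 1)
    return stop_now + continue_later


HORIZON = 300  # truncation point for the infinite sums (an assumption of this sketch)
partial_mass = sum(spec(k) for k in range(HORIZON))
partial_mean = sum(k * spec(k) for k in range(HORIZON))

print(float(partial_mass), float(partial_mean))
```

The partial sums approach 1 and 5 respectively, in agreement with the average t+5 stated above.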



10 Nondeterminism

According to some authors, nondeterminism comes in several varieties: angelic, demonic, 
oblivious, and prescient. To illustrate the differences, consider 
x:= rand 2; y:= 0 or y:= 1 

and we want the result x'=y' . If or is angelic nondeterminism, it chooses between its 
operands y:= 0 and y:= 1 in such a way that the desired result x'=y' is always achieved. 
If or is demonic nondeterminism, it chooses between its operands in such a way that the 
desired result is never achieved. Both angelic and demonic nondeterminism require 
knowledge of the value of variable x when choosing between assignments to y . 
Oblivious nondeterminism is restricted to making a choice without looking at the current (or 
past) state. It achieves x'=y' half the time. Now consider
    x:= 0 or x:= 1; y:= rand 2

and we want x'=y' . If or is angelically prescient, x will be chosen to match the future 
value of y , always achieving x'=y' . If or is demonically prescient, x will be chosen to 
avoid the future value of y , never achieving x'=y' . If or is not prescient, then x'=y' is 
achieved half the time. 

In predicative programming, nondeterminism is disjunction. Angelic, demonic, 
oblivious, and prescient are not kinds of nondeterminism, but ways of refining 
nondeterminism. In the example 

    x:= rand 2; (y:= 0) ∨ (y:= 1)

with desired result x'=y' , we can refine the nondeterminism angelically as y:= x , or
demonically as y:= 1−x , or obliviously as either y:= 0 or y:= 1 . In the example

    (x:= 0) ∨ (x:= 1); y:= rand 2

with desired result x'=y' , we first have to replace rand 2 by boolean variable r having
probability 1/2 . Then we can refine the nondeterminism with angelic prescience as x:= r ,
or with demonic prescience as x:= 1−r , or without prescience as either x:= 0 or x:= 1 .

Suppose we have one natural variable n whose initial value is 5 . After executing the
nondeterministic specification ok ∨ (n:= n+1) , we can say that the final value of n is
either 5 or 6 . Now suppose this specification is executed many times, and the
distribution of initial states is n=5 ( n always starts with value 5 ). What is the
distribution of final states? Nondeterminism is a freedom for the implementer, who may
refine the specification as ok , which always gives the answer n'=5 , or as n:= n+1 , which
always gives the answer n'=6 , or as

    if even t then ok else n:= n+1

which gives n'=5 or n'=6 unpredictably. In general, we cannot say the distribution of
final states after a nondeterministic specification. If we apply the formula Σσ· S×p to a
specification S that is nondeterministic, the result may not be a distribution. For example,

    Σn· (ok ∨ (n:= n+1)) × (n=5)   =   n'=5 ∨ n'=6

which is not a distribution because

    Σn'· n'=5 ∨ n'=6   =   2

Although n'=5 ∨ n'=6 is not a distribution, it does accurately describe the final state.

Suppose the initial value of n is described by the distribution (n=5)/2 + (n=6)/2 .
Application of the formula Σσ· S×p to our nondeterministic specification yields

    Σn· (ok ∨ (n:= n+1)) × ((n=5)/2 + (n=6)/2)

=   (n'=5 ∨ n'=6)/2 + (n'=6 ∨ n'=7)/2

Again, this is not a distribution, summing to 2 (the degree of nondeterminism).
Interpretation of nondistributions is problematic, but this might be interpreted as saying that
half of the time we will see either n'=5 or n'=6 , and the other half of the time we will see
either n'=6 or n'=7 .

Nondeterministic choice (P ∨ Q), probabilistic choice (if rand 2 then P else Q), and
deterministic choice (if b then P else Q) are not three different, competing ways of
forming a choice. Rather, they are three different degrees of information about a choice. In
fact, nondeterministic choice is equivalent to an unnormalized random choice. In one
variable x ,

    (x:= 0) ∨ (x:= 1)

=   x': 0,..2

=   2 × (x': 0,..2)/2                 introduce rand the same way we eliminate it

=   2 × (x' = rand 2)

=   2 × (x:= rand 2)

≥   x:= rand 2

Thus we prove

    (x:= 0) ∨ (x:= 1)  ≥  x:= rand 2

which is the generalization of refinement to probabilistic specifications. Nondeterministic 
choice can be refined by probabilistic choice. More generally, 

P v Q — 2 x if rand 2 then P else Q 

It is a well known boolean law that nondeterministic choice can be refined by 
deterministic choice. 

    P ∨ Q  ⇐  if b then P else Q

In fact, nondeterministic choice is equivalent to deterministic choice in which the 
determining expression is a variable of unknown value. 

    P ∨ Q  =  ∃b: bool· if b then P else Q

(The variable introduced is a mathematical variable, not a state variable; there is no b' .) 

This is what we will do: we replace each nondeterministic choice with an equivalent 
existentially quantified deterministic choice, choosing a fresh variable each time. Then we 
move the quantifier outward as far as possible. If we move it outside a loop, we must then
index the variable by iteration or by time, exactly as we did with the variable that replaces
occurrences of rand . All programming notations distribute over disjunction, so in any 
programming context, existential quantifiers (over a boolean domain) can be moved to the 
front. Before we prove that specification R is refined by a program containing a
nondeterministic choice, we make the following sequence of transformations. (The dots are
the context, or uninteresting parts, which remain unchanged from line to line.)

    R ⇐ ···(P ∨ Q)···

    R ⇐ ···(∃b· if b then P else Q)···

    R ⇐ (∃b· ···(if b then P else Q)···)

    ∀b· (R ⇐ ···(if b then P else Q)···)

A refinement is proved for all values of all variables anyway, even without explicit universal 
quantification, so effectively the quantifier disappears. 

With this transformation, let us look again at the example ok ∨ (n:= n+1) . With
input distribution n=5 we get

    Σn· (if b then ok else n:= n+1) × (n=5)
=   if b then n'=5 else n'=6

which is a distribution of n' because

    Σn'· if b then n'=5 else n'=6

=   if b then (Σn'· n'=5) else (Σn'· n'=6)

=   if b then 1 else 1

With input distribution (n=5)/2 + (n=6)/2 we get

    Σn· (if b then ok else n:= n+1) × ((n=5)/2 + (n=6)/2)

=   if b then (n'=5)/2 + (n'=6)/2 else (n'=6)/2 + (n'=7)/2

which is again a distribution of n . These answers retain the nondeterminism in the form 
of variable b , which was not part of the question, and whose value is unknown. 
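
The contrast between the two treatments can be made concrete. In the sketch below (illustrative only; the function names and the finite window FINAL_VALUES are assumptions made here) a specification is a function of the initial and final value of n returning 0 or 1, and final_weights applies the formula Σσ· S×p. The disjunction yields weights summing to 2, while each resolution of the unknown b yields a proper distribution.

```python
from fractions import Fraction

FINAL_VALUES = range(0, 12)  # assumed window, wide enough for these examples


def ok(n, n_final):
    return int(n_final == n)


def increment(n, n_final):
    return int(n_final == n + 1)


def either(n, n_final):
    # ok ∨ (n:= n+1), with the boolean result read as the number 0 or 1
    return int(ok(n, n_final) or increment(n, n_final))


def resolved(b):
    # if b then ok else n:= n+1, for a fixed value of the unknown b
    return ok if b else increment


def final_weights(spec, initial):
    """Sum over initial states of spec × initial probability, per final value."""
    return {
        n_final: sum(spec(n, n_final) * p for n, p in initial.items())
        for n_final in FINAL_VALUES
    }


always_five = {5: Fraction(1)}
five_or_six = {5: Fraction(1, 2), 6: Fraction(1, 2)}

for name, initial in (("n=5", always_five), ("(n=5)/2 + (n=6)/2", five_or_six)):
    print(name, "disjunction sums to", sum(final_weights(either, initial).values()))
    for b in (True, False):
        print(name, "b =", b, "sums to", sum(final_weights(resolved(b), initial).values()))
```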



11 Monty Hall's Problem 

To illustrate the combination of nondeterminism and probability, we look at Monty Hall's 
problem, which was the subject of an internet discussion group; various probabilities were 
hypothesized and argued. We will not engage in any argument; we just calculate. The 
problem is also treated in [4]. 

Monty Hall is a game show host, and in this game there are three doors. A prize is 
hidden behind one of the doors. The contestant chooses a door. Monty then opens one of 
the doors, but not the door with the prize behind it, and not the door the contestant has 
chosen. Monty asks the contestant whether they (the contestant) would like to change their 
choice of door, or stay with their original choice. What should the contestant do? 

Let p be the door where the prize is. Let c be the contestant's choice. Let m be the 
door Monty opens. If the contestant does not change their choice of door, the program is 
    (p:= 0) ∨ (p:= 1) ∨ (p:= 2);
    c:= rand 3;
    if c=p then (m:= c⊕1) ∨ (m:= c⊕2) else m:= 3−c−p;
    ok

The first line (p:= 0) ∨ (p:= 1) ∨ (p:= 2) says that the prize is placed behind one of the
doors; the contestant knows nothing about the criteria used for placement of the prize, so
from their point of view it is a nondeterministic choice. The second line c:= rand 3 is the
contestant's random choice of door. In the next line, ⊕ is addition modulo 3 ; if the
contestant happened to choose the door with the prize, then Monty can choose either of the
other two (nondeterministically); otherwise Monty must choose the one door that differs
from both c and p . This line can be written more briefly and more clearly as
c'=c≠m'≠p=p' . The final line ok is the contestant's decision not to change door.

We replace rand 3 with variable r . We introduce variable P of type 0, 1, 2 in order
to replace the nondeterministic assignment to p with

    if P=0 then p:= 0 else if P=1 then p:= 1 else p:= 2

or more simply p:= P . And since we never reassign p , we really don't need it as a
variable at all. We introduce variable M to express the nondeterminism in Monty's choice.
Our program is now deterministic (in terms of unknown P and M ) and so we can append
to it the condition for winning, which is c=P . We have

    c:= r;
    m:= if c=P then if M then c⊕1 else c⊕2 else 3−c−P;
    c = P                                    substitution law twice

=   r = P

Not surprisingly, the condition for winning is that the random choice made by the contestant
is the door where the prize is. Also not surprisingly, its probability is

    Σr· (r=P) × 1/3
=   1/3

If the contestant takes the opportunity offered by Monty of switching their choice of 
door, then the program, followed by the condition for winning, becomes 
    c:= r;
    m:= if c=P then if M then c⊕1 else c⊕2 else 3−c−P;
    c:= 3−c−m;
    c = P

In the first line, the contestant chooses door c at random. In the second line, Monty opens
door m , which differs from both c and P . In the next line, the contestant changes the
value of c but not to m ; thanks to the second line, this is deterministic; this could be
written more briefly and more clearly as c≠c'≠m=m' . The final line is the condition for
winning. After a small calculation ( c starts at r and then changes; the rest is irrelevant),
the above four lines simplify to

    r≠P

which says that the contestant wins if the random choice they made originally was not the
door where the prize is. Its probability is

    Σr· (r≠P) × 1/3
=   2/3

Perhaps surprisingly, the probability of winning is now 2/3 , so the contestant should 
switch. 
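
The two programs can be run for every value of the unknowns. The sketch below is an illustration, not the paper's calculation; monty_opens and win_probability are names assumed here. It treats P and M as parameters, sums over the contestant's random choice r, and confirms that the result does not depend on how the nondeterminism is resolved.

```python
from fractions import Fraction


def monty_opens(c, P, M):
    """Door opened by Monty: never the contestant's door c and never the prize door P."""
    if c == P:
        return (c + 1) % 3 if M else (c + 2) % 3
    return 3 - c - P


def win_probability(P, M, switch):
    """Probability of c = P at the end, summed over the uniform choice r: 0,..3."""
    total = Fraction(0)
    for r in range(3):
        c = r
        m = monty_opens(c, P, M)
        if switch:
            c = 3 - c - m  # the one door that is neither c nor m
        total += Fraction(int(c == P), 3)
    return total


for P in range(3):
    for M in (True, False):
        print(P, M, win_probability(P, M, False), win_probability(P, M, True))
```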



12 Mr.Bean's Socks 

Our next example originates in [4]; unlike Monty Hall's problem, it includes a loop. 
Mr.Bean is trying to get a matching pair of socks from a drawer containing an inexhaustible 
supply of red and blue socks (in the original problem the supply of socks is finite). He 
begins by withdrawing two socks from the drawer. If they match, he is done. Otherwise, he 
throws away one of them at random, withdraws another sock, and repeats. The choice of 
sock to throw away is probabilistic, with probability 1/2 for each color. As for the choice 
of sock to withdraw from the drawer, we are not told anything about how this choice is 
made, so it is nondeterministic. How long will it take him to get a matching pair?

Here is Mr.Bean's program (omitting the initialization). Variables L and R represent
the color of socks held in Mr.Bean's left and right hands.

    L'=R'  ⇐
        if L=R then ok
        else ( if rand 2 then (L:= red) ∨ (L:= blue) else (R:= blue) ∨ (R:= red);
               t:= t+1;  L'=R' )

As always, we begin by replacing the use of rand by a variable h (for hand), and we
introduce variable d to express the nondeterministic choices. Due to the loop we index
these variables with time. The refinement

    L'=R'  ⇐  if L=R then ok
              else ( if h t then if d t then L:= red else L:= blue
                     else if d t then R:= blue else R:= red;
                     t:= t+1;  L'=R' )

is easily proven. Now we need a hypothesis concerning the probability of execution times. 

Suppose the nondeterministic choices are made such that Mr.Bean always gets from the 
drawer a sock of the same color as he throws away. This means that the nondeterministic 
choices become 

    if d t then L:= red else L:= blue   =   ok
    if d t then R:= blue else R:= red   =   ok

(which means that d t just happens to have the same value as L=red ∧ R=blue each time).
If I were watching Mr.Bean repeatedly retrieving the same color sock that he has just thrown 
away, I would soon suspect him of doing so on purpose, or perhaps a malicious mechanism 
that puts the wrong sock in his hand. But the mathematics says nothing about purpose or 
mechanism; it may be just a fantastic coincidence. In any case, we can prove that execution 
takes either no time or forever 

    if L=R then t'=t else t'=∞   ⇐
        if L=R then ok else (t:= t+1; if L=R then t'=t else t'=∞)

but we cannot prove anything about the probability of those two possibilities.

At the other extreme, suppose Mr.Bean gets from the drawer a sock of the opposite 
color as he throws away. Then the nondeterministic choices become 
    if d t then L:= red else L:= blue   =   L:= R
    if d t then R:= blue else R:= red   =   R:= L

(which means that d t just happens to have the same value as L=blue ∧ R=red each time).
Again, if I observed Mr.Bean doing that each time the experiment is rerun, I would suspect a
mechanism or purpose, but the mathematics is silent about that. Now we can prove

    if L=R then t'=t else t'=t+1   ⇐
        if L=R then ok
        else ( if h t then L:= R else R:= L;
               t:= t+1; if L=R then t'=t else t'=t+1 )

which says that execution takes time 0 or 1 , but we cannot attach probabilities to those
two possibilities. If we make no assumption at all about d t , leaving the nondeterministic
choices unrefined, then the most we can prove about the execution time is

    if L=R then t'=t else t'>t

Another way to refine the nondeterministic choice is with a probabilistic choice. If we
attach probability 1/2 to each of the values of d t , then the distribution of execution times
is if L=R then t'=t else (t'>t) × 2^(t−t') . To prove it, we start with the right side of the
refinement, weakening ok to t'=t .

    Σ h t, d t· ( if L=R then t'=t
                  else ( if h t then if d t then L:= red else L:= blue
                         else if d t then R:= blue else R:= red;
                         t:= t+1; if L=R then t'=t else (t'>t) × 2^(t−t') ) )
                × 1/2 × 1/2                                          factor and sum

=   if L=R then t'=t
    else ( (L:= red;  t:= t+1; if L=R then t'=t else (t'>t) × 2^(t−t'))
         + (L:= blue; t:= t+1; if L=R then t'=t else (t'>t) × 2^(t−t'))
         + (R:= blue; t:= t+1; if L=R then t'=t else (t'>t) × 2^(t−t'))
         + (R:= red;  t:= t+1; if L=R then t'=t else (t'>t) × 2^(t−t')) ) / 4

                                                                     Substitution Law

=   if L=R then t'=t
    else ( (if red=R then t'=t+1 else (t'>t+1) × 2^(t+1−t'))
         + (if blue=R then t'=t+1 else (t'>t+1) × 2^(t+1−t'))
         + (if L=blue then t'=t+1 else (t'>t+1) × 2^(t+1−t'))
         + (if L=red then t'=t+1 else (t'>t+1) × 2^(t+1−t')) ) / 4

                                                                     R is either red or blue , and similarly L

=   if L=R then t'=t else (t'=t+1) / 2 + (t'>t+1) × 2^(t+1−t') / 2

=   if L=R then t'=t else (t'>t) × 2^(t−t')

which is the probability specification. That concludes the proof. The average value of t' is

    Σt'· t' × if L=R then t'=t else (t'>t) × 2^(t−t')

=   if L=R then t else Σt'· t' × (t'>t) × 2^(t−t')

=   t + if L=R then 0 else Σn: nat+1· n / 2^n

=   t + if L=R then 0 else 2

so, if the initial socks don't match, Mr.Bean draws an average of two more socks from the
drawer.
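
The distribution 2^(t−t') can be reproduced by propagating exact probabilities through the loop. The sketch below is an illustration under the same assumption as the proof, namely that h t and d t are each true with probability 1/2 and independent; the names step and stopping_distribution are chosen here, and the horizon of 40 draws is an arbitrary truncation.

```python
from fractions import Fraction
from itertools import product

RED, BLUE = "red", "blue"


def step(left, right, h, d):
    """One loop body: h picks the hand that is replaced, d picks the color that is drawn."""
    if h:
        left = RED if d else BLUE
    else:
        right = BLUE if d else RED
    return left, right


def stopping_distribution(left, right, horizon):
    """Map k to the probability that the socks first match after exactly k further draws."""
    if left == right:
        return {0: Fraction(1)}
    pending = {(left, right): Fraction(1)}  # probability mass on unmatched states
    result = {}
    for k in range(1, horizon + 1):
        next_pending = {}
        matched = Fraction(0)
        for (l, r), mass in pending.items():
            for h, d in product((True, False), repeat=2):
                new_l, new_r = step(l, r, h, d)
                share = mass / 4
                if new_l == new_r:
                    matched += share
                else:
                    key = (new_l, new_r)
                    next_pending[key] = next_pending.get(key, Fraction(0)) + share
        result[k] = matched
        pending = next_pending
    return result


dist = stopping_distribution(RED, BLUE, 40)
partial_mean = sum(k * p for k, p in dist.items())
print(dist[1], dist[2], dist[3], float(partial_mean))
```

The partial mean approaches 2, matching the average of two more socks derived above.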

In the previous paragraph, we chose to leave the initial drawing nondeterministic, and to 
assign probabilities to the drawing of subsequent socks. Clearly we could attach 
probabilities to the initial state too. Or we could attach probabilities to the initial state and 
leave the subsequent drawings nondeterministic. The theory is quite general. But in this 
problem, if we leave both the initial and subsequent drawings nondeterministic, attaching 
probabilities only to the choice of hand, we can say nothing about the probability of 
execution times or average execution time. 



13 Partial Probabilistic Specifications 

Suppose we want x to be 0 one-third of the time. We don’t care how often x is 1 or 2 
or anything else, as long as x is 0 one-third of the time. To express the distribution of x 
would be overspecification. The first two lines below specify just what we want, and the 
last two lines are one way to refine the specification as a distribution, 
    if 1/3 then x=0 else x≠0

=   (x=0)/3 + (x≠0)×2/3

≥   (x=0)/3 + (x=1)×2/3

=   if 1/3 then x=0 else x=1

In general, a superdistribution is a partial probabilistic specification, which can be refined to 
a distribution. In general, a subdistribution is unimplementable. 

Now suppose we want x to be 0 or 1 one-third of the time, and to be 1 or 2
one-third of the time. Two distributions that satisfy this informally stated specification are

    (x=0)/3 + (x=2)/3 + (x=3)/3
    (x=1)/3 + (x=3)×2/3

The smallest expression that is greater than or equal to both these expressions (the most
refined expression that is refined by both these expressions) is

    (x=0)/3 + (x=1)/3 + (x=2)/3 + (x=3)×2/3

Unfortunately, this new expression is also refined by

    (x=2)/3 + (x=3)×2/3

which does not satisfy the informally stated specification. The problem is known as convex 
closure, and it prevents us from formalizing the specification as a superdistribution. We 
must return to the standard form of specification, a boolean expression, this time about the 
partially known distribution. Let p x be the probability distribution of x . Then what we 
want to say is 

    (∀x· 0≤px≤1) ∧ (Σx· px)=1 ∧ p0+p1 = p1+p2 = 1/3

This specification can be refined in the normal way: by reverse implication. For example,

    (∀x· 0≤px≤1) ∧ (Σx· px)=1 ∧ p0+p1 = p1+p2 = 1/3
⇐   p0 = p2 = p3 = 1/3 ∧ ∀x: x≠0 ∧ x≠2 ∧ x≠3· px=0
=   ∀x· px = ((x=0)/3 + (x=2)/3 + (x=3)/3)
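
The convex closure problem described in this section can be seen on the three distributions given above. The sketch below is illustrative only; satisfies, upper_bound and refined_by are names assumed here. Distributions are dictionaries from values of x to probabilities, refinement of a superdistribution is pointwise ≥, and the informal specification is read as the boolean condition p0+p1 = p1+p2 = 1/3.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

d1 = {0: THIRD, 2: THIRD, 3: THIRD}   # (x=0)/3 + (x=2)/3 + (x=3)/3
d2 = {1: THIRD, 3: 2 * THIRD}         # (x=1)/3 + (x=3)×2/3
d3 = {2: THIRD, 3: 2 * THIRD}         # (x=2)/3 + (x=3)×2/3


def satisfies(p):
    """The boolean specification p0+p1 = p1+p2 = 1/3 about a distribution p."""
    return p.get(0, 0) + p.get(1, 0) == THIRD and p.get(1, 0) + p.get(2, 0) == THIRD


def upper_bound(p, q):
    """Smallest expression that is pointwise greater than or equal to both p and q."""
    return {x: max(p.get(x, 0), q.get(x, 0)) for x in set(p) | set(q)}


def refined_by(spec, dist):
    """spec is refined by dist when spec ≥ dist at every value."""
    return all(spec.get(x, 0) >= prob for x, prob in dist.items())


bound = upper_bound(d1, d2)
print(satisfies(d1), satisfies(d2), satisfies(d3))
print(refined_by(bound, d1), refined_by(bound, d2), refined_by(bound, d3))
```

The superdistribution admits d3 even though d3 fails the boolean specification, which is why the specification has to be stated about the distribution p itself.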



14 Conclusion 

Our first approach to probabilistic programming was to reinterpret the types of variables as 
probability distributions expressed as functions. In that approach, if x was a variable of
type T , it becomes a variable of type T→prob such that Σx = Σx' = 1 . All operators
then need to be extended to distributions expressed as functions. Although this approach 
works, it was too low-level; a distribution expressed as a function tells us about the 
probability of its variables by their positions in an argument list, rather than by their names. 
So we opened the probability expressions, leaving free the variables whose probabilities are 
being described. 

By considering specifications and programs to be boolean expressions, and by 
considering boolean to be a subtype of numbers, we can make probabilistic calculations 
directly on programs and specifications. Without any new mechanism, we include 
probabilistic timing. From the distribution of execution times we can calculate the average 
execution time; this is often of more interest than the worst case execution time, which is 
the usual concern in computational complexity. 

We include an if then else notation (as is standard), and we have generalized booleans 
to probabilities (as in [4]), so we already have a probabilistic choice notation (for example, 
if 1/3 then P else Q ); there is no need to invent another. We have used the rand 
“function”, not because we advocate it (we don't), but because it is found in many 
programming languages; we cope with it by replacing it with something that obeys the 
usual laws of mathematical calculation. 

Informal reasoning to arrive at a probability distribution, as is standard in studies of 
probability, is essential to forming a reasonable hypothesis. But probability problems are 
notorious for misleading even professional mathematicians; hypotheses are sometimes 
wrong. Sometimes the misunderstanding can be traced to a different understanding of the 
problem. Our first step, formalization as a program, makes one's understanding clear. After 
that step, we offer a way to prove a hypothesis about probability distributions. 

Nondeterministic choice is handled by introducing a variable to represent the 
nondeterminacy. In [4], instead of calculating probabilities, they calculate a lower bound on 
probabilities: they find the precondition that ensures that the probability of outcome o' is
at least p . In contrast to that, from the distribution of prestates we calculate the entire
range of possible distributions of poststates. With less mechanism we obtain more
information. We did not treat nondeterministic choice and probabilistic choice as different
kinds of choice; nondeterminism can be refined, and one way to refine it is
probabilistically; the “at least” inequality is the generalization of refinement.

The convex closure problem, which prevents partial probabilistic specification, is a 
serious disappointment. It limits not only the work described in this paper, but any attempt 
to generalize specifications to probabilities, such as [4] where it is discussed at length. The 
only way around it seems to be to abandon probabilistic specification, and to write boolean 
specifications about distribution-valued variables. 

Probabilistic specifications can also be interpreted as “fuzzy” specifications. For
example, (x'=0)/3 + (x'=1)×2/3 could mean that we will be one-third satisfied if the result
x' is 0 , two-thirds satisfied if it is 1 , and completely unsatisfied if it is anything else.



Abstract. A parallel prefix circuit takes n inputs x1, x2, ..., xn and
produces the n outputs x1, x1 ∘ x2, ..., x1 ∘ x2 ∘ ··· ∘ xn, where ‘∘’ is an
arbitrary associative binary operation. Parallel prefix circuits and their 
counterparts in software, parallel prefix computations or scans, have nu- 
merous applications ranging from fast integer addition over parallel sort- 
ing to convex hull problems. A parallel prefix circuit can be implemented 
in a variety of ways taking into account constraints on size, depth, or fan- 
out. Traditionally, implementations are either defined graphically or by 
enumerating the underlying graph. Both approaches have their pros and 
cons. A figure if well drawn conveys the possibly recursive structure of 
the scan but it is not amenable to formal manipulation. A description 
in form of a graph while rigorous obscures the structure of a scan and 
is equally hard to manipulate. In this paper we show that parallel pre- 
fix circuits enjoy a very pleasant algebra. Using only two basic building 
blocks and four combinators all standard designs can be described suc- 
cinctly and rigorously. The rules of the algebra allow us to prove the 
circuits correct and to derive circuit designs in a systematic manner. 



LORD DARLINGTON. . . . [Sees a fan lying on the table.] And what 
a wonderful fan! May I look at it? 

LADY WINDERMERE. Do. Pretty, isn't it! It's got my name on it,
and everything. I have only just seen it myself. It’s my husband's 
birthday present to me. You know to-day is my birthday? 

- Oscar Wilde, Lady Windermere's Fan 



1 Introduction 

A parallel prefix computation determines the sums of all prefixes of a given se- 
quence of elements. The term sum has to be understood in a broad sense: parallel 
prefix computations are not confined to addition, any associative operation can 
be used. Functional programmers know parallel prefix computations as scans, 
a term which originates from the language APL [1]. We will use both terms 
synonymously. 

Parallel prefix computations have numerous applications; the most well-known
is probably the carry-lookahead adder [2], a parallel prefix circuit. Other
applications include the maximum segment sum problem, parallel sorting, solving
recurrences, and convex hull problems; see [3].

A parallel prefix computation seems to be inherently sequential. However, it 
can be made to run in logarithmic time on a parallel architecture or in hardware. 
In fact, scans can be implemented in a variety of ways taking into account 
constraints on measures such as size, depth, or fan-out. 
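
To make the contrast concrete, the following Python sketch (an illustration added here, not one of the circuits studied in the paper; both function names are assumptions) compares the obvious sequential scan with a divide-and-conquer scan. In the second version the two halves are independent of each other, so on a parallel machine the depth of the computation grows logarithmically with the number of inputs, at the price of more operation nodes.

```python
from itertools import accumulate
from operator import add


def sequential_scan(op, xs):
    """Left-to-right scan: each output depends on the previous one."""
    return list(accumulate(xs, op))


def divide_and_conquer_scan(op, xs):
    """Scan both halves independently, then fold the last left prefix into every right prefix."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left = divide_and_conquer_scan(op, xs[:mid])
    right = divide_and_conquer_scan(op, xs[mid:])
    return left + [op(left[-1], y) for y in right]


numbers = [3, 1, 4, 1, 5, 9, 2, 6]
letters = list("scan")  # string concatenation is associative but not commutative

print(divide_and_conquer_scan(add, numbers))
print(divide_and_conquer_scan(add, letters))
```

Only associativity of the operation is used; the order of the operands is preserved, which is why the example with strings still agrees with the sequential scan.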

A particular implementation can be modelled as a directed acyclic oriented 
graph and this is what most papers on the subject actually do. The structure is 
a graph as opposed to a tree because subcomputations can be shared. Actually, 
it is an ordered graph, that is, the inputs of a node are ordered, because the 
underlying binary operation is not necessarily commutative. Here is an example 
graph. 




The edges are directed downwards; a node of in-degree two, an operation node, 
represents the sum of its two inputs; a node of in-degree one and out-degree 
greater than one, a duplication node , distributes its input to its outputs. 

Different implementations can be compared with respect to several different 
measures: the size is the number of operation nodes, the depth is the maximum 
number of operation nodes on any path, and the fan-out is the maximal out- 
degree of an operation node. In the example above the size is 74, the depth 
is 5, and the fan-out is 17. If implemented in hardware, the size and the fan-out 
determine the required chip area, the depth influences the speed. Other factors 
include regularity of layout and interconnection. 

It is not too hard - but perhaps slightly boring - to convince oneself that 
the above circuit is correct: given n = 32 inputs x_1, x_2, ..., x_n it produces 
the n outputs x_1, x_1 o x_2, ..., x_1 o x_2 o ... o x_n, where ‘o’ is the 
underlying binary operation. The ‘picture as proof’ technique works reasonably 
well for a parallel prefix circuit of a small fixed width. However, an 
implementation usually defines a family of circuits, one for each number of 
inputs. In this case, the graphical approach is not an option, especially when 
it comes to proving correctness. Some papers define a family of graphs by 
numbering the nodes and enumerating the edges, see, for instance, [4]. While 
this certainly counts as a rigorous definition it is way too concrete: an 
explicit graph representation obscures the structure of the design and is hard 
to manipulate formally. 

In this paper we show that parallel prefix circuits enjoy a pleasant algebra. 
Using only two basic building blocks and four combinators all standard designs 
can be described succinctly and rigorously. The rules of the algebra allow us to 
prove the circuits correct and to derive new designs in a systematic manner. 

The rest of the paper is structured as follows. Section 2 motivates the basic 
combinators and their associated laws. Section 3 introduces two scan combi- 
nators: horizontal and vertical composition of scans. Using these combinators 
various recursive constructions can be defined and proven correct, see Section 4. 
Section 5 discusses more sophisticated designs: minimum depth circuits that have 
the minimal number of operation nodes. Section 6 then considers size-optimal 
circuits with bounded fan-out. Finally, Section 7 reviews related work and Sec- 
tion 8 concludes. 



2 Basic Combinators 

This section defines the algebra of scans. Throughout the paper we employ the 
programming language Haskell [5] as the meta language. In particular, Haskell’s 
class system is put to good use: classes allow us to define algebras and instances 
allow us to define associated models. 



2.1 Monoids 

The binary operation underlying a scan must be associative. Without loss of gen- 
erality we assume that it also has a neutral element so that we have a monoidal 
structure. 

   class Monoid a where 
      ε   :: a 
      (o) :: a -> a -> a 

Each instance of Monoid must satisfy the following laws. 

   ε o x       = x 
   x o ε       = x 
   x o (y o z) = (x o y) o z 

For example, the parallel prefix circuit that computes carries in a carry- 
lookahead adder is based on the following monoid. 

   data KPG = K | P | G 

   instance Monoid KPG where 
      ε     = P 
      K o f = K 
      P o f = f 
      G o f = G 

The elements of the type KPG represent carry propagation functions: K kills a 
carry (λc -> 0), P propagates a carry (λc -> c), and G generates a carry 
(λc -> 1). The operation ‘o’ implements function composition, which is 
associative and has the identity, P, as its neutral element. 
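
The carry monoid can be tried out directly. The following is a minimal sketch that uses the standard `Semigroup` and `Monoid` classes (writing `<>` for the operation and `mempty` for the neutral element) rather than the class declared above; `carry`, `monoidLaws` and `composes` are names introduced here for illustration. Since the type has only three elements, the laws can be checked exhaustively.

```haskell
-- Carry propagation as a monoid, using the standard classes:
-- (<>) stands for the monoid operation, mempty for its neutral element.
data KPG = K | P | G
  deriving (Eq, Show, Enum, Bounded)

instance Semigroup KPG where
  K <> _ = K
  P <> f = f
  G <> _ = G

instance Monoid KPG where
  mempty = P

-- Interpretation of an element as a function on carry bits.
carry :: KPG -> Bool -> Bool
carry K _ = False
carry P c = c
carry G _ = True

-- Exhaustive check of the unit laws and associativity.
monoidLaws :: Bool
monoidLaws = and
  [ mempty <> x == x && x <> mempty == x && x <> (y <> z) == (x <> y) <> z
  | x <- elems, y <- elems, z <- elems ]
  where elems = [minBound .. maxBound] :: [KPG]

-- The operation agrees with function composition on the interpretations.
composes :: Bool
composes = and
  [ carry (x <> f) c == (carry x . carry f) c
  | x <- [minBound .. maxBound], f <- [minBound .. maxBound], c <- [False, True] ]
```

Both `monoidLaws` and `composes` evaluate to `True`, which confirms on this small type that the operation is associative with `P` as its unit.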

2.2 The Algebra of Fans and Scans 

Reconsidering the example graph of the introduction we note that a parallel 
prefix circuit can be seen as a composition of fans. Here are fans of different 
widths in isolation. 

[Figure: fans of different widths] 

A fan adds its first input - counting from left to right - to each of its remaining 
inputs. It consists of a duplication node and n - 1 operation nodes. A scan is 
constructed by arranging fans horizontally and vertically. As an example, the 
following scan consists of three fans: a 3-fan placed below two 2-fans. 



Placing two circuits side by side is called parallel or horizontal composition, 
denoted ‘×’. 

[Figure: two examples of horizontal composition] 

Placing two circuits on top of each other is called serial or vertical composition, 
denoted ‘;’. We require that the two circuits have the same width. 

[Figure: an example of vertical composition] 



Horizontal and vertical composition, however, are not sufficient as combining 
forms as the following circuit demonstrates (which occurs as a subcircuit in the 
introductory example). 




At first sight, it seems that a more general fan combinator is needed. The fans 
in the middle part are not contiguous: the first input is only propagated to 
each second remaining input, the other inputs are wired through. However, a 
moment’s reflection reveals that the middle part is really the previous circuit 
stretched by a factor of two. This observation motivates the introduction of a 
stretch combinator: generalizing from a single stretch factor, the combinator ‘>-’ 
takes a list of widths and stretches a given circuit accordingly. 

[Figure: the previous circuit stretched by the widths [2, 2, 2, 2]] 



The inputs of the resulting circuit are grouped according to the given widths. In 
the example above, we have four groups, each of width 2. The last input of each 
group is connected to the argument circuit; the other inputs are wired through. 

To summarize, the example parallel prefix circuit is denoted by the following 
algebraic expression (fan_i represents a fan of width i and id_i represents the 
identity circuit of width i). 

   fan_2 × fan_2 × fan_2 × fan_2 ; 
   [2, 2, 2, 2] >- (fan_2 × fan_2 ; id_1 × fan_3) ; 
   id_1 × fan_2 × fan_2 × fan_2 × id_1 
The following class declaration defines the algebra of fans and scans. Note 
that the class Circuit abstracts over a type constructor γ which in turn is 
parameterized by the underlying monoid. The type variable γ serves as a 
placeholder for the carrier of the algebra. 

   type Width  = Nat 
   type Width+ = Nat+ 

   class Circuit γ where 
      fan  :: (Monoid a) => Width -> γ a 
      id   :: Width -> γ a 
      (;)  :: γ a -> γ a -> γ a 
      (×)  :: γ a -> γ a -> γ a 
      (>-) :: [Width+] -> γ a -> γ a 
      (-<) :: γ a -> [Width+] -> γ a 
      |·|  :: γ a -> Width 

The above class declaration makes explicit that only fans rely on the underlying 
monoidal structure; the remaining combinators can be seen as glue. The class 
additionally introduces a second stretch combinator ‘-<’ which is similar to ‘>-’ 
except that it connects the first input of each group to its argument circuit. The 
following pictures illustrate the difference between the two combinators. 

[Figure: [2, 3, 1] >- fan_3 (left) and fan_3 -< [2, 3, 1] (right)] 

We shall see that ‘>-’ is useful for combining scans, while ‘-<’ is a natural choice 
for combining fans. The list argument of the stretch combinators must contain 
positive widths (Nat+ is the type of naturals excluding zero). 

The width of a circuit, say f, is denoted |f|. Being able to query the width 
of a circuit is important as some combinators are subject to width constraints: 
f ; g is only defined if |f| = |g|; f -< x and x >- f require that |f| = #x where 
#x denotes the length of the list x. In particular, f -< [] is only valid if |f| = 0. 
We lift the width combinator to lists of circuits abbreviating [|f| | f <- fs] by 
|fs|. 

To save parentheses we agree that ‘-<’ and ‘>-’ bind more tightly than ‘×’, 
which in turn takes precedence over ‘;’. 

   infixr 1 ; 
   infixr 2 × 
   infix  4 >-, -< 

The fixity declarations furthermore ensure that the combinators bind less tightly 
than Haskell’s list concatenation ‘++’. As an example, f × g -< x ++ y ; h 
abbreviates (f × (g -< (x ++ y))) ; h. 

The following derived combinators will prove useful in the sequel. 

   par :: (Circuit γ) => [γ a] -> γ a 
   par = foldr (×) id_0 

   seq :: (Circuit γ) => Width -> [γ a] -> γ a 
   seq n = foldr (;) id_n 

The combinator par generalizes ‘×’ and places a list of circuits side by side. 
Likewise, seq generalizes ‘;’ and places a list of circuits above each other. 

   infix 4 ≻, ≺ 

   (≻) :: (Circuit γ) => [γ a] -> γ a -> γ a 
   fs ≻ f = par fs ; |fs| >- f 

   (≺) :: (Circuit γ) => γ a -> [γ a] -> γ a 
   f ≺ fs = f -< |fs| ; par fs 

The combinators ‘≻’ and ‘≺’ are convenient variants of ‘>-’ and ‘-<’: the 
expression f ≺ [f_1, ..., f_n] connects the i-th output of f to the first input 
of f_i, while [f_1, ..., f_n] ≻ f connects the last output of f_i to the i-th input 
of f. Thus, ‘≻’ is similar to the composition of an n-ary function with n argument 
functions. 

In Haskell, we can model circuits as list processing functions of type [a] -> [a] 
where a is the underlying monoid. Serial composition is then simply forward 
function composition; parallel composition satisfies (f × g) (x ++ y) = f x ++ g y 
where ‘++’ denotes list concatenation and |f| = #x and |g| = #y. Figure 1 displays 
the complete instance declaration, which can be seen as the standard model of 
Circuit. Put differently, the intended semantics of the combinators is given by the 
list processing functions in Figure 1. Some remarks are in order. The expression 
Σx denotes the sum of the elements of the list x. The function group that is 
used in the definition of ‘-<’ and ‘>-’ takes a list of lengths and partitions its 
second argument accordingly. The expression [e | a <- x | b <- y] is a parallel 
list comprehension and abbreviates [e | (a, b) <- zip x y]. 
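
The list model can also be written in plain ASCII Haskell and executed. The sketch below is one possible rendering, not the paper's instance declaration: the names `ident`, `serial`, `beside`, `stretchLast` and `stretchFirst` are stand-ins chosen here for the symbolic combinators, `<>` from `Semigroup` plays the role of the monoid operation, and all stretch widths are assumed to be positive. The value `example8` transcribes the width-8 example expression of this section.

```haskell
-- A circuit is modelled by its width and a list transformer.
data Trans a = Trans { width :: Int, apply :: [a] -> [a] }

-- A fan adds its first input to each of its remaining inputs.
fan :: Semigroup a => Int -> Trans a
fan n = Trans n step
  where
    step []       = []
    step (a : as) = a : [a <> b | b <- as]

ident :: Int -> Trans a
ident n = Trans n id

-- Serial composition: f is applied first, then g.
serial :: Trans a -> Trans a -> Trans a
serial f g = Trans (width f) (apply g . apply f)

-- Parallel composition: f takes the first (width f) inputs, g the rest.
beside :: Trans a -> Trans a -> Trans a
beside f g = Trans (width f + width g) $ \u ->
  let (y, z) = splitAt (width f) u
  in  apply f y ++ apply g z

-- Partition a list into consecutive groups of the given lengths.
group :: [Int] -> [a] -> [[a]]
group []      _  = []
group (i : x) as = bs : group x cs
  where (bs, cs) = splitAt i as

-- Stretch that connects the last input of each group to f.
stretchLast :: [Int] -> Trans a -> Trans a
stretchLast x f = Trans (sum x) $ \u ->
  let ys = group x u
      as = apply f (map last ys)
  in  concat [init y ++ [a] | (y, a) <- zip ys as]

-- Stretch that connects the first input of each group to f.
stretchFirst :: Trans a -> [Int] -> Trans a
stretchFirst f x = Trans (sum x) $ \u ->
  let ys = group x u
      as = apply f (map head ys)
  in  concat [a : tail y | (y, a) <- zip ys as]

-- The width-8 example circuit of this section.
example8 :: Semigroup a => Trans a
example8 =
  foldr1 beside (replicate 4 (fan 2))
    `serial` stretchLast [2, 2, 2, 2]
               ((fan 2 `beside` fan 2) `serial` (ident 1 `beside` fan 3))
    `serial` foldr1 beside [ident 1, fan 2, fan 2, fan 2, ident 1]
```

Applying `example8` to the one-letter strings "a" to "h" yields the same list as `scanl1 (<>)`, so on this input the circuit behaves as a scan. Strings are a convenient test monoid because concatenation is not commutative.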

The algebraic laws each instance of the class Circuit has to satisfy are listed in 
Figure 2. The reader is invited to convince themself that the instance of Figure 1 
is indeed a model in that sense. The list is not complete though: Figure 2 includes 
only the structural laws, rules that do not involve fans. The properties of fans 
will be discussed in a separate paragraph below. Most of the laws except, perhaps, 
those concerned with ‘-<’ and ‘>-’ are straightforward: ‘;’ is associative with 
id_n as its neutral element; ‘×’ is associative with id_0 as its neutral element; 
‘×’ preserves identity and vertical composition. Most of the laws are subject to 
width constraints: (f × g) ; (f' × g') = (f ; f') × (g ; g'), for instance, is only 
valid if |f| = |f'| and |g| = |g'|. Use of these laws in subsequent proofs will be 
signalled by the hint composition. 

Figure 2 only lists the laws for ‘-<’; its companion combinator ‘>-’ satisfies 
analogous properties. The equations show that ‘-<’ preserves identity and 
composition (replicate n a constructs a list containing exactly n copies of a). The 
second but last law in Figure 2 demonstrates that nested occurrences of 
stretch combinators can be flattened. The last equation, termed flip law, shows 
that ‘-<’ can be defined in terms of ‘>-’ and vice versa. Recall that ‘>-’ connects 
last inputs and ‘-<’ connects first inputs. So strictly, only one stretch combinator 
is necessary. It is, however, convenient to have both at our disposal. Use of these 
laws will be signalled by the hint stretching. 

   data Trans a = Trans {width :: Width, apply :: [a] -> [a]} 

   instance Circuit Trans where 
      fan n  = Trans n (λu -> case u of 
                                 []     -> [] 
                                 a : as -> a : [a o b | b <- as]) 
      id n   = Trans n (λu -> u) 
      f ; g  = Trans |f| (λu -> apply g (apply f u)) 
      f × g  = Trans (|f| + |g|) (λu -> let (y, z) = splitAt |f| u 
                                        in  apply f y ++ apply g z) 
      x >- f = Trans (Σx) (λu -> let ys = group x u 
                                     as = apply f [last y | y <- ys] 
                                 in  concat [init y ++ [a] | y <- ys | a <- as]) 
      f -< x = Trans (Σx) (λu -> let ys = group x u 
                                     as = apply f [head y | y <- ys] 
                                 in  concat [[a] ++ tail y | y <- ys | a <- as]) 
      |f|    = width f 

   group :: [Int] -> [a] -> [[a]] 
   group []      as = [] 
   group (i : x) as = bs : group x cs 
      where (bs, cs) = splitAt i as 

Fig. 1. The standard model of the scan algebra 

   id_{|f|} ; f               = f 
   f ; id_{|f|}               = f 
   f ; (g ; h)                = (f ; g) ; h 

   |id_n|                     = n 
   |f ; g|                    = |f| = |g| 
   |f × g|                    = |f| + |g| 
   |fan_n|                    = n 
   |f -< x|                   = Σx 
   |x >- f|                   = Σx 

   id_0 × f                   = f 
   f × id_0                   = f 
   f × (g × h)                = (f × g) × h 
   id_m × id_n                = id_{m+n} 
   (f × g) ; (f' × g')        = (f ; f') × (g ; g') 

   id_{#x} -< x               = id_{Σx} 
   f -< replicate |f| 1       = f 
   (f ; g) -< x               = (f -< x) ; (g -< x) 
   (f × g) -< (x ++ y)        = (f -< x) × (g -< y) 
   (f -< x) -< y              = f -< [Σz | z <- group x y] 
   id_{i-1} × (f -< y ++ [k]) = ([i] ++ y >- f) × id_{k-1} 

Fig. 2. The structural laws of the scan algebra 

As a warm-up in scan calculations, let us derive two simple consequences, 
which we need later on. 

   f -< x ++ [j + k]         = (f -< x ++ [j]) × id_k        (1) 
   (f × id_{#y-1}) -< x ++ y = f -< x ++ [Σy]                (2) 

The rules allow us to push the identity, id_n, in and out of a stretch. To prove 
(1) we argue 

   f -< x ++ [j + k] 
=    { flip law } 
   ([1] ++ x >- f) × id_{j+k-1} 
=    { composition } 
   ([1] ++ x >- f) × id_{j-1} × id_k 
=    { flip law } 
   (f -< x ++ [j]) × id_k 

Property (2) is equally easy to show. 

   (f × id_{#y-1}) -< x ++ y 
=    { stretching } 
   (f -< x ++ [head y]) × (id_{#y-1} -< tail y) 
=    { stretching } 
   (f -< x ++ [head y]) × id_{Σ(tail y)} 
=    { derived stretch law (1) } 
   f -< x ++ [Σy] 

Let us now turn to the axioms involving fans. Fans of width less than two 
are equal to the identities. 

   fan_0 = id_0 
   fan_1 = id_1 

As an aside, this implies that the identity combinator, id_n, can be defined as a 
horizontal composition of fans. 

   id_n = fan_1 × ... × fan_1      (n times) 

The first non-trivial fan law, equation (3) below, allows the designer of scans to 
trade depth for fan-out. Here is an instance of the law. 

[Figure: an instance of the first fan law] 

The circuit on the left has a depth of 2 and a fan-out of 5 while the circuit on the 
right has depth 1 and fan-out 8. The first fan law generalizes from the example. 

   fan_{1+n} ≺ [fan_m ≺ fs] ++ gs = fan_{m+n} ≺ fs ++ gs        (3) 

Interestingly, this rule is still structural as it does not rely on any properties of 
the underlying operator. Only the very last law, equation (4) below, employs 
the associativity of ‘o’. Before we discuss the rule let us first take a look at some 
examples. 




Both circuits have the same depth but the circuit on the right has fewer operation 
nodes. The left circuit consists of a big fan below a layer of smaller fans. The 
big fan adds its first input to each of the intermediate values; the same effect is 
achieved on the right by broadcasting the first input to each of the smaller fans. 
Here is the smallest instance of this optimization. 




The left circuit, id_1 × fan_2 ; fan_3 = id_2 ≺ [id_1, fan_2] ; fan_3, maps the 
inputs x_1, x_2, x_3 to the outputs x_1, x_1 o x_2, x_1 o (x_2 o x_3), while the 
right circuit, fan_2 × id_1 ; id_1 × fan_2 = fan_2 ≺ [fan_1, fan_2], maps 
x_1, x_2, x_3 to x_1, x_1 o x_2, (x_1 o x_2) o x_3. 
Clearly, the outputs are equal if and only if ‘o’ is associative. However, the first 
circuit consists of three operation nodes while the second requires only two. The 
second fan law captures this optimization. 

   id_{1+#x} ≺ [id_i] ++ [fan_j | j <- x] ; fan_{i+Σx} 
      = fan_{1+#x} ≺ [fan_i] ++ [fan_j | j <- x]              (4) 

The size of the circuit of the right-hand side is always at most the size of the 
circuit on the left-hand side. Unless all the ‘small’ circuits are trivial, the depth 
of both circuits is the same. Thus, the second fan law is the central rule when it 
comes to optimizing scans. 
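
The smallest instance of the second fan law can be replayed symbolically. In the sketch below, which is an illustration and not part of the paper's development, outputs are kept as uninterpreted expression trees so that the bracketing chosen by each circuit stays visible. The names `Expr`, `fanE`, `left`, `right`, `flatten` and `size` are introduced here; counting distinct `Op` subterms is used as a simple stand-in for counting shared operation nodes.

```haskell
import Data.List (nub)

-- Uninterpreted outputs: variables and applications of the operation.
data Expr = Var Int | Op Expr Expr
  deriving (Eq, Show)

-- A fan adds its first input to each of its remaining inputs.
fanE :: [Expr] -> [Expr]
fanE []       = []
fanE (a : as) = a : [Op a b | b <- as]

-- Left circuit: a wire beside a 2-fan, followed by a 3-fan.
left :: [Expr] -> [Expr]
left xs = fanE (take 1 xs ++ fanE (drop 1 xs))

-- Right circuit: a 2-fan beside a wire, followed by a wire beside a 2-fan.
right :: [Expr] -> [Expr]
right xs = take 1 ys ++ fanE (drop 1 ys)
  where ys = fanE (take 2 xs) ++ drop 2 xs

-- Forgetting the bracketing amounts to assuming associativity.
flatten :: Expr -> [Int]
flatten (Var i)  = [i]
flatten (Op a b) = flatten a ++ flatten b

-- All applications of the operation occurring in an expression.
subOps :: Expr -> [Expr]
subOps (Var _)    = []
subOps e@(Op a b) = e : subOps a ++ subOps b

-- Number of distinct operation nodes needed for a list of outputs.
size :: [Expr] -> Int
size = length . nub . concatMap subOps
```

On the inputs `[Var 1, Var 2, Var 3]` the two circuits produce different trees, but the trees agree after `flatten`; `size` reports three nodes for `left` and two for `right`.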

In the sequel we will also need the following derived law, which is essentially 
a binary version of the second fan law. 

   id_m × fan_{n+1} ; fan_{m+n+1} = fan_{1+m} × id_n ; id_m × fan_{n+1}     (5) 

We argue as follows. 

   id_m × fan_{n+1} ; fan_{m+n+1} 
=    { second fan law } 
   fan_2 ≺ [fan_m, fan_{n+1}] 
=    { stretching } 
   fan_2 ≺ [fan_m ≺ replicate m id_1, fan_{n+1}] 
=    { first fan law } 
   fan_{1+m} ≺ replicate m id_1 ++ [fan_{n+1}] 
=    { definition of ‘≺’ } 
   fan_{1+m} -< replicate m 1 ++ [n + 1] ; par (replicate m id_1 ++ [fan_{n+1}]) 
=    { derived stretch law (1) } 
   (fan_{1+m} -< replicate m 1 ++ [1]) × id_n ; par (replicate m id_1 ++ [fan_{n+1}]) 
=    { stretching } 
   fan_{1+m} × id_n ; par (replicate m id_1 ++ [fan_{n+1}]) 
=    { composition } 
   fan_{1+m} × id_n ; id_m × fan_{n+1} 
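
As a sanity check of the derived law (5), both sides can be run on concrete inputs. The sketch below models circuits as plain list functions with explicitly passed widths; `beside`, `lhs`, `rhs` and `law5` are illustrative names, and the comparison assumes m >= 1, since the stretch widths involved must be positive.

```haskell
-- A fan adds its first input to each of its remaining inputs.
fan :: Semigroup a => [a] -> [a]
fan []       = []
fan (a : as) = a : [a <> b | b <- as]

-- Parallel composition: f receives the first m inputs, g the rest.
beside :: Int -> ([a] -> [a]) -> ([a] -> [a]) -> [a] -> [a]
beside m f g u = f y ++ g z
  where (y, z) = splitAt m u

-- Left-hand side: m wires beside a fan, followed by one big fan.
lhs :: Semigroup a => Int -> [a] -> [a]
lhs m = fan . beside m id fan

-- Right-hand side: a fan of width 1 + m beside wires,
-- followed by m wires beside a fan.
rhs :: Semigroup a => Int -> [a] -> [a]
rhs m = beside m id fan . beside (1 + m) fan id

-- Compare both sides on m + n + 1 distinct one-letter strings.
law5 :: Int -> Int -> Bool
law5 m n = lhs m inputs == rhs m inputs
  where inputs = [[c] | c <- take (m + n + 1) ['a' ..]]
```

A test over small widths, such as `and [law5 m n | m <- [1 .. 5], n <- [0 .. 5]]`, is of course no substitute for the calculation above; it only guards against transcription slips.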

3 Serial and Parallel Scan Combinators 

The combinators we have seen so far are the basic building blocks of scans. The 
blocks can be composed in a multitude of ways, the resulting circuits not nec- 
essarily implementing parallel prefix circuits. By contrast, the combining forms 
introduced in this section take scans to scans, they are scan combinators. 

Before we proceed, we should first make precise what we mean by ‘scan’ in 
our framework. Scans are, like fans, parameterized by the width of the circuit. 
We specify 

   scan_0     = id_0 
   scan_{n+1} = succ scan_n 

where succ is given by 

   succ :: (Circuit γ, Monoid a) => γ a -> γ a 
   succ f = id_1 × f ; fan_{|f|+1} 

Whenever we introduce a new implementation of scans in the sequel, we will 
show using the laws of the algebra that the family of circuits is equal to scan_n. 
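
The specification can be executed in the list model. The following sketch is one possible encoding, with illustrative names (`ident`, `serial`, `beside`, `succC`) and `<>` standing for the monoid operation; it assumes a non-negative width argument.

```haskell
-- A circuit is modelled by its width and a list transformer.
data Trans a = Trans { width :: Int, apply :: [a] -> [a] }

fan :: Semigroup a => Int -> Trans a
fan n = Trans n step
  where
    step []       = []
    step (a : as) = a : [a <> b | b <- as]

ident :: Int -> Trans a
ident n = Trans n id

-- Serial composition: f first, then g.
serial :: Trans a -> Trans a -> Trans a
serial f g = Trans (width f) (apply g . apply f)

-- Parallel composition.
beside :: Trans a -> Trans a -> Trans a
beside f g = Trans (width f + width g) $ \u ->
  let (y, z) = splitAt (width f) u
  in  apply f y ++ apply g z

-- One more input: a wire beside f, followed by a fan over everything.
succC :: Semigroup a => Trans a -> Trans a
succC f = (ident 1 `beside` f) `serial` fan (width f + 1)

-- The specification of scans as a nest of fans (n must not be negative).
scan :: Semigroup a => Int -> Trans a
scan 0 = ident 0
scan n = succC (scan (n - 1))
```

On one-letter strings, `apply (scan n)` agrees with `scanl1 (<>)` for the widths tried in the accompanying test, for example `apply (scan 3) ["a", "b", "c"]` gives `["a", "ab", "abc"]`.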

The first scan combinator implements the serial or vertical composition of 
scans: the last output of the first circuit is fed into the first input of the second 
circuit. 

   infixr 3 \ 

   (\) :: (Circuit γ) => γ a -> γ a -> γ a 
   f \ g = f × id_{|g|-1} ; id_{|f|-1} × g 

Because of the overlap the width of the resulting circuit is one less than the sum 
of the widths of the two arguments: |f \ g| = |f| + |g| - 1. The depth does not 
necessarily increase, as the following example illustrates. 

[Figure: an example of the serial composition of two scans] 

The rightmost operation node of the first circuit is placed upon the uppermost 
leftmost duplication node of the second circuit. 

Serial composition of scans is associative with id_1 as its neutral element. 

   id_1 \ f    = f 
   f \ id_1    = f 
   f \ (g \ h) = (f \ g) \ h 

The first two laws are straightforward to show; the proof of associativity is quite 
instructive: it reveals that f \ (g \ h) and (f \ g) \ h are even structurally 
equivalent, that is, they can be rewritten into each other using only structural 
rules. 

   f \ (g \ h) 
=    { definition of ‘\’ } 
   f × id_{|g|+|h|-2} ; id_{|f|-1} × (g \ h) 
=    { definition of ‘\’ } 
   f × id_{|g|+|h|-2} ; id_{|f|-1} × (g × id_{|h|-1} ; id_{|g|-1} × h) 
=    { composition } 
   f × id_{|g|+|h|-2} ; id_{|f|-1} × g × id_{|h|-1} ; id_{|f|+|g|-2} × h 
=    { composition } 
   (f × id_{|g|-1} ; id_{|f|-1} × g) × id_{|h|-1} ; id_{|f|+|g|-2} × h 
=    { definition of ‘\’ } 
   (f \ g) × id_{|h|-1} ; id_{|f|+|g|-2} × h 
=    { definition of ‘\’ } 
   (f \ g) \ h 

Serial composition interacts nicely with stretching. Let #x = |f| - 1 and #y = 
|g|, then 

   (f \ g) -< x ++ y = (f -< x ++ [1]) \ (g -< y)               (6) 

The proof builds upon the derived stretch laws. 

   (f \ g) -< x ++ y 
=    { definition of ‘\’ } 
   (f × id_{|g|-1} ; id_{|f|-1} × g) -< x ++ y 
=    { stretching } 
   (f × id_{|g|-1}) -< x ++ y ; (id_{|f|-1} × g) -< x ++ y 
=    { derived stretch laws (1) and (2) } 
   (f -< x ++ [1]) × id_{Σy-1} ; (id_{|f|-1} × g) -< x ++ y 
=    { stretching } 
   (f -< x ++ [1]) × id_{Σy-1} ; id_{Σx} × (g -< y) 
=    { definition of ‘\’ } 
   (f -< x ++ [1]) \ (g -< y) 

The second scan combinator is the parallel or horizontal composition of scans: 
both circuits are placed side by side, an additional fan adds the last output of 
the left circuit to each output of the right circuit. 

   infixl 3 □ 

   (□) :: (Circuit γ, Monoid a) => γ a -> γ a -> γ a 
   f □ g = f × g ; id_{|f|-1} × fan_{|g|+1} 

The widths sum up: |f □ g| = |f| + |g|. Because of the additional fan the depth 
increases by one. Here is an example application of ‘□’. 




Before we turn to the algebraic properties of ‘□’, let us first note that the parallel 
composition of scans is really a serial composition in disguise. 

   f □ g = f \ succ g 

The proof is straightforward. 

   f □ g 
=    { definition of ‘□’ } 
   f × g ; id_{|f|-1} × fan_{|g|+1} 
=    { composition } 
   f × id_{|g|} ; id_{|f|} × g ; id_{|f|-1} × fan_{|g|+1} 
=    { composition } 
   f × id_{|g|} ; id_{|f|-1} × (id_1 × g ; fan_{|g|+1}) 
=    { definition of ‘\’ } 
   f \ (id_1 × g ; fan_{|g|+1}) 
=    { definition of succ } 
   f \ succ g 

Parallel composition is associative, as well, and has id_0 as its right unit. It does 
not possess a left unit though as id_0 □ f is undefined (the first argument must 
have a positive width). 

   f □ id_0    = f 
   f □ (g □ h) = (f □ g) □ h 

As opposed to serial composition, the circuits f □ (g □ h) and (f □ g) □ h are not 
structurally equivalent: the latter circuit has fewer operation nodes. The proof 
rests upon the above characterization of parallel composition. 

   f □ (g □ h) 
=    { characterization of ‘□’ } 
   f \ succ (g \ succ h) 
=    { see below } 
   f \ succ g \ succ h 
=    { characterization of ‘□’ } 
   (f □ g) □ h 

The second step is justified by the following calculations. 

   succ (f \ succ g) 
=    { definition of succ and ‘\’ } 
   id_1 × f × id_{|g|} ; id_{|f|+1} × g ; id_{|f|} × fan_{|g|+1} ; fan_{|f|+|g|+1} 
=    { derived fan law (5) } 
   id_1 × f × id_{|g|} ; id_{|f|+1} × g ; fan_{|f|+1} × id_{|g|} ; id_{|f|} × fan_{|g|+1} 
=    { composition } 
   id_1 × f × id_{|g|} ; fan_{|f|+1} × id_{|g|} ; id_{|f|+1} × g ; id_{|f|} × fan_{|g|+1} 
=    { definition of succ and ‘\’ } 
   succ f \ succ g 

Since the proof relies on the second fan law, succ f \ succ g has fewer nodes 
than succ (f \ succ g). 

Let us finally record the fact that succ, ‘\’ and ‘□’ are scan combinators. 

   succ scan_n         = scan_{n+1} 
   scan_{m+1} \ scan_n = scan_{m+n} 
   scan_m □ scan_n     = scan_{m+n} 

The first law holds by definition. The third equation implies the second and the 
third equation can be shown by a straightforward induction over m. 

4 Simple Scans 

It is high time to look at some implementations of parallel prefix circuits. We 
have already encountered one of the most straightforward implementations, a 
simple nest of fans, which serves as the specification. 

   scan_0     = id_0 
   scan_{n+1} = succ scan_n 

Here is an example circuit of width 8. 

[Figure: the circuit scan_8] 

The circuit scan_n is, in fact, the worst possible implementation as it has 
maximum depth and the maximal number of operation nodes, namely, n * (n - 1) / 2, 
among all scans of the same width. Since succ f = id_1 \ succ f = id_1 □ f, we can 
alternatively define scan_n as a parallel composition of trivial circuits. 

   scan_{n+1} = id_1 □ scan_n 

Now, if we bracket the parallel composition differently, we obtain the serial 
scan, whose correctness is immediate. 

   ser_0     = id_0 
   ser_1     = id_1 
   ser_{n+1} = ser_n □ id_1 

The graphical representation illustrates why ser_n is called serial scan. 




The serial scan has maximum depth, but the least number of operation nodes, 
namely, n - 1, among all scans of the same width. In a sequential language ser_n 
is the implementation of choice; it corresponds, for instance, to Haskell’s scanl 
operation. Using f □ id_1 = f \ succ id_1 = f \ fan_2 we can rewrite the definition 
of ser_n to emphasize its serial nature. 

   ser_{n+1} = ser_n \ fan_2 

Now, if we balance the parallel composition more evenly, we obtain parallel 
prefix circuits of minimum depth. 



   rec_n 
      | n <= 1    = id_n 
      | otherwise = rec_⌈n/2⌉ □ rec_⌊n/2⌋ 
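
The balanced construction can be run in the list model. The sketch below is an illustration under the same assumptions as the standard model: `pscan` is a name chosen here for the parallel composition of scans (it requires a left argument of positive width), `recScan` stands for the family defined above, and `<>` plays the role of the monoid operation.

```haskell
-- A circuit is modelled by its width and a list transformer.
data Trans a = Trans { width :: Int, apply :: [a] -> [a] }

fan :: Semigroup a => Int -> Trans a
fan n = Trans n step
  where
    step []       = []
    step (a : as) = a : [a <> b | b <- as]

ident :: Int -> Trans a
ident n = Trans n id

serial :: Trans a -> Trans a -> Trans a
serial f g = Trans (width f) (apply g . apply f)

beside :: Trans a -> Trans a -> Trans a
beside f g = Trans (width f + width g) $ \u ->
  let (y, z) = splitAt (width f) u
  in  apply f y ++ apply g z

-- Parallel composition of scans: both circuits side by side, then a fan
-- that adds the last output of f to every output of g.
pscan :: Semigroup a => Trans a -> Trans a -> Trans a
pscan f g =
  (f `beside` g) `serial` (ident (width f - 1) `beside` fan (width g + 1))

-- Divide and conquer: the left half gets the extra input for odd widths.
recScan :: Semigroup a => Int -> Trans a
recScan n
  | n <= 1    = ident n
  | otherwise = recScan ((n + 1) `div` 2) `pscan` recScan (n `div` 2)
```

For example, `apply (recScan 3) ["a", "b", "c"]` gives `["a", "ab", "abc"]`, the same result as `scanl1 (<>)`.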

Here is a minimum-depth circuit of width 32. 

[Figure: the circuit rec_32] 

Note that the tree of operation nodes that computes the last output is fully 
balanced, which explains why the depth is minimal. If the width is not a power of 
two, then rec_n constructs a slightly skewed tree, known as a Braun tree [6]. Since 
‘□’ is associative, we can, of course, realize arbitrary tree shapes; other choices 
include left-complete trees or quasi left-complete trees [7]. For your amusement, 
here is a Fibonacci-tree of width 34 

[Figure: a Fibonacci-tree of width 34] 

defined in the obvious way. 

   fib_0     = id_0 
   fib_1     = id_1 
   fib_{n+2} = fib_{n+1} □ fib_n 


5 Depth-Optimal Scans 

5.1 Brent-Kung Circuits 

The rec n family of circuits implements a simple divide-and-conquer scheme. A 
different recursive decomposition was devised by Brent and Kung [8]. As an 
example, here is a Brent-Kung circuit of width 32. 




The inputs are ‘paired’ using a layer of 2-fans. Every second output is then fed 
into a Brent-Kung circuit of half the width; the other inputs are wired through. A 
final layer of 2-fans, shifted by one position, distributes the results of the nested 
Brent-Kung circuit to the wired-through signals. Every recursive step halves the 
number of inputs and increases the depth by two. Consequently, Brent-Kung 
circuits have logarithmic but not minimum depth. On the other hand, they use 
fewer operation nodes than the rec_n circuits and furthermore they have only a 
fan-out of 2! 

Turning to the algebraic description, we note that the first layer of 2-fans 
can be generalized to a layer of scans of arbitrary, not necessarily equal widths. 

   (o) :: (Circuit γ, Monoid a) => [γ a] -> γ a -> γ a 
   [] o g       = g 
   (f : fs) o g = (f : fs) ≻ g ; id_{|f|-1} × par gs 
      where gs = [fan_{|f|} | f <- fs] ++ [id_1] 

Each scan, except the first one, is complemented by a fan in the final layer, 
shifted one position to the left. The operator ‘o’ is also a scan combinator; it 
takes a list of scans and a scan to a resulting scan. 

   [scan_i | i <- x] o scan_{#x} = scan_{Σx}                  (7) 

The Brent-Kung circuit is given by the following definition. 

   bk_n 
      | n <= 1    = id_n 
      | otherwise = (replicate ⌊n/2⌋ fan_2 ++ [id_1 | odd n]) o bk_⌈n/2⌉ 

The nested scan has width ⌈n/2⌉: if the number of inputs is odd, then the nested 
scan additionally takes the last input. As an aside to non-Haskell experts, the 
idiom [e | b] is a trivial list comprehension that evaluates to [] if b is False and 
to [e] if b is True. Furthermore, note that bk_n is a so-called restricted parallel 
prefix circuit, whose last output has minimum depth. 

The Brent-Kung decomposition is based on the binary number system. Since 
the operator ‘o’ works for arbitrary scans, it is not hard to generalize the de- 
composition to an arbitrary base. 

   gbk b n 
      | n <= b    = ser_n 
      | r == 0    = (replicate d ser_b) o gbk b d 
      | otherwise = (replicate d ser_b ++ [ser_r]) o gbk b (d + 1) 
      where (d, r) = divMod n b 

The definition of gbk uses serial scans as ‘base’ circuits. This is, of course, an 
arbitrary choice; any scan will do. Here is a base-3 circuit of width 27. 




This circuit has size 46 and depth 8, while its binary cousin has size 47 and 
depth 10. 
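
The quoted sizes can be reproduced by counting operation nodes along the definitions. The sketch below is a back-of-the-envelope calculation, not part of the paper: it assumes that a fan of width w contributes w - 1 nodes, that a serial scan of width n contributes n - 1 nodes, and that the generalized decomposition is used with a base of at least 2. The names `fanSize`, `combSize`, `bkSize` and `gbkSize` are introduced here.

```haskell
-- A fan of width w >= 1 has w - 1 operation nodes (none for w <= 1).
fanSize :: Int -> Int
fanSize w = max 0 (w - 1)

-- Size of the Brent-Kung style combination of a list of scans with a
-- nested scan. Each scan is given as (width, size); g is the size of the
-- nested scan. Every scan except the first is complemented by a fan.
combSize :: [(Int, Int)] -> Int -> Int
combSize [] g = g
combSize fs g = sum (map snd fs) + g + sum [fanSize w | (w, _) <- tail fs]

-- Size of the Brent-Kung circuit of width n.
bkSize :: Int -> Int
bkSize n
  | n <= 1    = 0
  | otherwise = combSize (replicate (n `div` 2) (2, 1) ++ [(1, 0) | odd n])
                         (bkSize ((n + 1) `div` 2))

-- Size of the generalized circuit of width n for base b (assumes b >= 2).
gbkSize :: Int -> Int -> Int
gbkSize b n
  | n <= b    = fanSize n   -- a serial scan of width n has n - 1 nodes
  | r == 0    = combSize (replicate d serB) (gbkSize b d)
  | otherwise = combSize (replicate d serB ++ [(r, r - 1)]) (gbkSize b (d + 1))
  where
    (d, r) = divMod n b
    serB   = (b, b - 1)
```

With these definitions `gbkSize 3 27` evaluates to 46 and `bkSize 27` to 47, matching the sizes stated for the base-3 circuit and its binary cousin.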
Let us turn to the proof that ‘o’ is a scan combinator. Property (7) can be 
proven by induction over the length of x. We confine ourselves to showing the 
induction step. Let k = #ss = #fs, i = |s|, j = |head ss|, n = j + Σ|fs| and 
finally fs = [fan_{|s|} | s <- tail ss] ++ [fan_1], then 

   s : ss o scan_{k+1} 
=    { definition of ‘o’ } 
   s : ss ≻ scan_{k+1} ; id_{i-1} × par (fan_j : fs) 
=    { property of scan } 
   s : ss ≻ (id_1 □ scan_k) ; id_{i-1} × par (fan_j : fs) 
=    { definition of ‘□’ } 
   s : ss ≻ (id_1 × scan_k ; fan_{k+1}) ; id_{i-1} × par (fan_j : fs) 
=    { stretching } 
   s : ss ≻ (id_1 × scan_k ; fan_{k+1}) ; id_{i-1} × (id_{k+1} ≺ fan_j : fs) 
=    { shift law (8), see below } 
   s : ss ≻ (id_1 × scan_k) ; id_{i-1} × (fan_{k+1} ≺ fan_j : fs) 
=    { fan law } 
   s : ss ≻ (id_1 × scan_k) ; id_{i-1} × (id_{k+1} ≺ id_j : fs ; fan_n) 
=    { stretching } 
   s : ss ≻ (id_1 × scan_k) ; id_{i-1} × (par (id_j : fs) ; fan_n) 
=    { composition } 
   s × (ss ≻ scan_k) ; id_{i-1} × (par (id_j : fs) ; fan_n) 
=    { composition } 
   s × (ss ≻ scan_k) ; id_{i-1} × par (id_j : fs) ; id_{i-1} × fan_n 
=    { composition } 
   s × (ss ≻ scan_k ; par (id_{j-1} : fs)) ; id_{i-1} × fan_n 
=    { definition of ‘□’ } 
   s □ (ss ≻ scan_k ; par (id_{j-1} : fs)) 
=    { definition of par } 
   s □ (ss ≻ scan_k ; id_{j-1} × par fs) 
=    { definition of ‘o’ } 
   s □ (ss o scan_k) 

The shift law, used in the fourth step, is a combination of the flip law and the 
laws for stretching. 

   (fs ≻ (l ; m)) × f ; g × (r ≺ gs) = (fs ≻ l) × f ; g × ((m ; r) ≺ gs)      (8) 

We reason as follows. 

   (fs ≻ (l ; m)) × f ; g × (r ≺ gs) 
=    { composition } 
   par fs × f ; (|fs| >- (l ; m)) × id_{|f|} ; id_{|g|} × (r -< |gs|) ; g × par gs 
=    { flip law } 
   par fs × f ; id_{|g|} × ((l ; m) -< |gs|) ; id_{|g|} × (r -< |gs|) ; g × par gs 
=    { stretching } 
   par fs × f ; id_{|g|} × (l -< |gs|) ; id_{|g|} × ((m ; r) -< |gs|) ; g × par gs 
=    { flip law } 
   par fs × f ; (|fs| >- l) × id_{|f|} ; id_{|g|} × ((m ; r) -< |gs|) ; g × par gs 
=    { composition } 
   (fs ≻ l) × f ; g × ((m ; r) ≺ gs) 



5.2 Ladner-Fischer Circuits 

Can we combine the good properties of rec and bk - rec has minimum depth, 
while bk gets away with fewer operation nodes? Yes, we can! Reconsider the 
circuit rec 32 in Section 4 and note that the left part does not occupy the bottom 
level. The idea, which is due to Ladner and Fischer [9], is to use the Brent-Kung 
decomposition for the left part - recall that it increases the depth by two - and 
the ‘usual’ decomposition for the right part. The following combinator captures 
one step of the Brent-Kung scheme. 

   double :: (Circuit γ, Monoid a) => (Width -> γ a) -> (Width -> γ a) 
   double s n = (replicate ⌊n/2⌋ fan_2 ++ [id_1 | odd n]) o s ⌈n/2⌉ 

Using double we can define a depth-optimal parallel prefix circuit that has the 
minimal number of operation nodes among all minimum-depth circuits [10]. 

   opt n 
      | n <= 1    = id_n 
      | otherwise = double opt ⌈n/2⌉ □ opt ⌊n/2⌋ 

The following example circuit of width 32 illustrates that all layers are nicely 
exploited. 




The size of the circuit is 74. By contrast, rec 32 consists of 80 operation nodes. 
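
Both node counts can be recomputed from the recursive definitions. The following sketch is a rough calculation under stated assumptions: a parallel composition of scans adds a fan with as many operation nodes as the width of its right argument, and one Brent-Kung step on n inputs adds ⌊n/2⌋ nodes in the first layer and ⌊n/2⌋ - 1 nodes in the final layer (none if there is at most one pair). The names `recSize`, `doubleSize` and `optSize` are introduced here.

```haskell
-- Size of the balanced divide-and-conquer circuit of width n.
recSize :: Int -> Int
recSize n
  | n <= 1    = 0
  | otherwise = recSize hi + recSize lo + lo
  where
    hi = (n + 1) `div` 2
    lo = n `div` 2

-- Size of one Brent-Kung step around a nested scan whose size is given
-- by the function s: a layer of 2-fans, the nested scan of half the
-- width (rounded up), and the shifted final layer of 2-fans.
doubleSize :: (Int -> Int) -> Int -> Int
doubleSize s n = pairs + s ((n + 1) `div` 2) + max 0 (pairs - 1)
  where pairs = n `div` 2

-- Size of the depth-optimal circuit of width n.
optSize :: Int -> Int
optSize n
  | n <= 1    = 0
  | otherwise = doubleSize optSize hi + optSize lo + lo
  where
    hi = (n + 1) `div` 2
    lo = n `div` 2
```

Evaluating `optSize 32` gives 74 and `recSize 32` gives 80, in agreement with the figures quoted above.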

The double combinator allows the scan designer to trade depth for size. The 
Ladner-Fischer circuit, defined below, generalizes opt introducing the notion of 
extra depth: the first argument of lf specifies the extra depth that the designer 
is willing to accept in return for a smaller size. 

   lf k 0       = id_0 
   lf k 1       = id_1 
   lf 0 n       = lf 1 ⌈n/2⌉ □ lf 0 ⌊n/2⌋ 
   lf (k + 1) n = double (lf k) n 

It is not hard to see that lf 0 specializes to opt and lf ∞ specializes to bk. In a 
sense, Ladner-Fischer mediates between the two recursive decompositions. 



6 Size-Optimal Scans 

6.1 Lin-Hsiao Circuits 

The ‘o’ combinator constructs a slightly asymmetric circuit: not every scan 
has a corresponding fan. Circuits with a more symmetric design were recently 
introduced by Lin and Hsiao [4]. As an example, here is one of their circuits of 
width 25, called wl§. 




Every scan in the upper part is complemented by a corresponding fan in the 
lower part. The two parts are joined by a ‘~r’-like shape (turned 90° 
clockwise) that connects the first input to the last output. The ‘~r’ combinator 
is easy to derive. 

   scan_{n+1} 
=    { property of scan } 
   id_1 □ scan_n 
=    { definition of ‘□’ } 
   id_1 × scan_n ; fan_{n+1} 
=    { stretching } 
   [id_1, scan_n] ≻ id_2 ; fan_{n+1} 
=    { fan laws } 
   [id_1, scan_n] ≻ id_2 ; fan_2 ≺ [fan_n, id_1] 

Thus, we define 

   (~r) :: (Circuit γ, Monoid a) => γ a -> γ a -> γ a 
   f ~r g = [id_1, f] ≻ id_2 ; fan_2 ≺ [g, id_1] 

We have |f ~r g| = |f| + 1 = |g| + 1. The derivation above implies that 
scan_n ~r fan_n = scan_{n+1}. The ‘~r’ combinator constructs a so-called zig-zag 
circuit whose height difference is one. The height difference is the length of the 
path from the first input to the last output. The low height difference of one 
renders zig-zag circuits attractive for serial composition. This is utilized in [4] 
to construct size-optimal circuits. A size-optimal circuit has the minimal number 
of operation nodes among all circuits of a fixed given depth. 

Perhaps surprisingly, a serial composition of two zig-zag circuits can again 
be written as a zig-zag circuit. 

   (l_1 ~r u_1) \ (l_2 ~r u_2) = ([l_1, l_2] ≻ scan_2) ~r (fan_2 ≺ [u_1, u_2])     (9) 

To justify this we argue (i_1 = |l_1| = |u_1| and i_2 = |l_2| = |u_2|) 

   (l_1 ~r u_1) \ (l_2 ~r u_2) 
=    { definition of ‘\’ } 
   (l_1 ~r u_1) × id_{i_2} ; id_{i_1} × (l_2 ~r u_2) 
=    { definition of ‘~r’ } 
   (id_1 × l_1 ; fan_2 ≺ [u_1, id_1]) × id_{i_2} ; id_{i_1} × (id_1 × l_2 ; fan_2 ≺ [u_2, id_1]) 
=    { composition } 
   id_1 × l_1 × l_2 ; (fan_2 -< [i_1, 1]) × id_{i_2} ; id_{i_1} × (fan_2 -< [i_2, 1]) ; u_1 × u_2 × id_1 
=    { derived stretch law (6) } 
   id_1 × l_1 × l_2 ; (fan_2 \ fan_2) -< [i_1, i_2, 1] ; u_1 × u_2 × id_1 
=    { fan_2 \ fan_2 = scan_3 = id_1 □ scan_2 = id_1 × scan_2 ; fan_3 } 
   id_1 × l_1 × l_2 ; (id_1 × scan_2 ; fan_3) -< [i_1, i_2, 1] ; u_1 × u_2 × id_1 
=    { stretching } 
   id_1 × l_1 × l_2 ; [1, i_1, i_2] >- (id_1 × scan_2) ; fan_3 -< [i_1, i_2, 1] ; u_1 × u_2 × id_1 
=    { composition } 
   [id_1, l_1, l_2] ≻ (id_1 × scan_2) ; fan_3 ≺ [u_1, u_2, id_1] 
=    { composition and fan law } 
   id_1 × ([l_1, l_2] ≻ scan_2) ; fan_2 ≺ [fan_2 ≺ [u_1, u_2], id_1] 
=    { definition of ‘~r’ } 
   ([l_1, l_2] ≻ scan_2) ~r (fan_2 ≺ [u_1, u_2]) 

The property can even be generalized to an n-fold composition. 

{h -I~ Ui) \ \ {l n -T u n ) = ([?i , . . . ,l n ] >- scan n ) ~r {fan n A [mi, . . . , m„]) 
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The proof of this property proceeds by a simple induction. We only show the 
induction step. 

(^1 — ^ r Ul) ^ * * * ^ {In U n ) ^ {ln+ 1 U n _fi) 

= { ex hypothesi } 

(( [ (l ; * * • ? In ] SC(l7l n ) — C - {f(lTl n “<! [ Ui , . . . , ])) ^ (^n+1 — W n _|_i ) 

= { see above } 

([[Zi, scan n , l n + i] >- scan2) ~r {fan 2 -< [fan n -< [m, . . . , u„], u„+i]) 

= { scan law (10), see below } 

{[h, ■ ■ ■ , In, ln+i] y scan n+ i) _r ( fan 2 -< [fan n -< [ui,. . . ,u n \,u n+1 ]) 

= { fan law } 

( [ (l 5 • • • 5 ln+l] y SCan n -\. l) -f~ {fdTl n ^_i [ U\ , . . . , W n _|_l ]) 

The scan law used in the third step is analogous to the first fan law. 

[/s >~ scan m ] -H- gs >- scani+„ = /s -H- gs >- scan m + n (10) 

The proof is left as an exercise to the reader. 

To summarize, ‘_r’ combines a tree of scans with a corresponding tree of 
fans to a scan. The combinator allows us to shape a scan after an arbitrary tree 
structure. This makes it easy, for instance, to take constraints on the fan-out into 
account - the fan-out corresponds directly to the degree of a tree. As an example, 
let us define the Lin-Hsiao circuit wl shown above. The following Haskell data 
declaration introduces a suitable tree type and its associated fold operation. 

data Tree α = Leaf α | Node [Tree α]

fold :: (α -> β) -> ([β] -> β) -> (Tree α -> β)
fold leaf node (Leaf a)  = leaf a
fold leaf node (Node ts) = node [fold leaf node t | t <- ts]

The scan tree and the fan tree of a zig-zag circuit can be implemented as two 
simple folds. 

zig-zag :: (Circuit γ, Monoid α) => Tree Width -> γ α
zig-zag t = fold ser s-node t _r fold fan f-node t

s-node ts = ts >- ser_{#ts}
f-node ts = fan_{#ts} -< ts

The ‘base’ circuit of the s-node can be any scan. The same is true of the f-node 
- recall that the first fan law allows us to rewrite a single fan as a nest of fans. 

Now, the tree underlying the wl circuit is given by the following definition 
(note that the argument does not correspond to the width). 

wl-tree_5     = Node [Leaf 4, Leaf 4, Leaf 4]
wl-tree_{n+1} = Node [wl-tree_n, wl-tree_n]

The circuit is then simply defined as the composition of zig-zag and wl-tree.

wl_n = zig-zag wl-tree_n

Lin and Hsiao show that a slightly optimized version of wl_n - using the first fan
law the two 2-fans in the center are merged into a 3-fan - is size-optimal [4].
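As a sanity check on the tree definition, the widths of these circuits can be computed from the tree alone. The following Haskell sketch assumes, beyond what the text states explicitly, that stretching a scan over a list of subcircuits yields a circuit whose width is the sum of the subcircuit widths; together with |f _r g| = |f| + 1 this gives the width of the zig-zag circuit. The names wlTree, treeWidth and zigZagWidth are illustrative and not taken from the paper.

```haskell
-- Rose trees and their fold, as in the text.
data Tree a = Leaf a | Node [Tree a]
  deriving (Show, Eq)

fold :: (a -> b) -> ([b] -> b) -> Tree a -> b
fold leaf _    (Leaf a)  = leaf a
fold leaf node (Node ts) = node [fold leaf node t | t <- ts]

type Width = Int

-- wl-tree_5 and wl-tree_{n+1}; only defined for arguments of at least 5.
wlTree :: Int -> Tree Width
wlTree n
  | n == 5    = Node [Leaf 4, Leaf 4, Leaf 4]
  | n > 5     = Node [wlTree (n - 1), wlTree (n - 1)]
  | otherwise = error "wlTree: argument must be at least 5"

-- Assumption: the width of a stretched scan is the sum of the leaf widths.
treeWidth :: Tree Width -> Width
treeWidth = fold id sum

-- Width of the zig-zag circuit, using |f _r g| = |f| + 1.
zigZagWidth :: Tree Width -> Width
zigZagWidth t = treeWidth t + 1
```

Under these assumptions zigZagWidth (wlTree 6) evaluates to 25, which agrees with the width quoted for the example circuit, and illustrates that the argument of wl-tree is not the width.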

6.2 Brent-Kung, Revisited 

Interestingly, the Brent-Kung circuit can be seen as a zig-zag circuit in disguise, 
or rather, as a serial composition of zig-zag circuits. Reconsider the example 
graph given in Section 5.1 and note that the right part has the characteristic 
shape of a zig-zag circuit: the tree in the upper part is mirrored in the lower 
part, in fact, they can be mapped onto each other through a 180° rotation (this 
is because binary fans and binary scans are equal) . 

The tree shape underlying a Brent-Kung circuit is that of a Braun tree. 

braun_n
  | n <= 2    = Leaf n
  | otherwise = Node [braun_⌈n/2⌉, braun_⌊n/2⌋]

Here is the alternative definition of Brent-Kung as a serial composition of zig-zag
circuits.

bk'_n
  | n <= 2    = ser_n
  | otherwise = bk'_{d+r} \ zig-zag braun_d
  where (d, r) = divMod n 2



The graphical representation reveals that this variant is more condensed: every 
fan is placed at the topmost possible position. 
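The recursive structure of bk' can be checked for consistency without building any circuit. The sketch below is only a width calculation under stated assumptions: the Braun tree splits n into its ceiling and floor halves (our reading of the definition above), the width of a scan tree is the sum of its leaves, a zig-zag circuit is one wider than its scan tree, and the overlapping composition ‘\’ shares one wire, so |f \ g| = |f| + |g| - 1. The names treeWidth and bkWidth are hypothetical.

```haskell
data Tree a = Leaf a | Node [Tree a]
  deriving (Show, Eq)

fold :: (a -> b) -> ([b] -> b) -> Tree a -> b
fold leaf _    (Leaf a)  = leaf a
fold leaf node (Node ts) = node [fold leaf node t | t <- ts]

-- Braun tree over widths: n is split into ceiling and floor halves.
-- Intended for n >= 1.
braun :: Int -> Tree Int
braun n
  | n <= 2    = Leaf n
  | otherwise = Node [braun (d + r), braun d]
  where (d, r) = n `divMod` 2

-- Assumption: the width of a scan tree is the sum of its leaf widths.
treeWidth :: Tree Int -> Int
treeWidth = fold id sum

-- Width of bk'_n, following the recursion of bk' and assuming
-- |f \ g| = |f| + |g| - 1 and |zig-zag t| = treeWidth t + 1.
bkWidth :: Int -> Int
bkWidth n
  | n <= 2    = n
  | otherwise = bkWidth (d + r) + (treeWidth (braun d) + 1) - 1
  where (d, r) = n `divMod` 2
```

With these assumptions bkWidth n equals n for every n >= 1, so the recursion produces a circuit of the requested width.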




7 Related Work 

Parallel prefix computations are nearly as old as the history of computers. One 
of the first implementations of fast integer addition using carry-lookahead was 
described by Weinberger and Smith [11]. However, the operation of the circuit 
seemed to rely on the particularities of carry propagation. It was only 20 years
later that Ladner and Fischer formulated the abstract problem of prefix computation
and showed that carry computation is an instance of this class [9]. In
fact, they showed the more general result that any finite-state transducer can be 
simulated in logarithmic time using a parallel prefix circuit. 

As an aside, the idea underlying this proof is particularly appealing to functional
programmers as it relies on currying. Let φ :: (A, A) -> A be an arbitrary
binary operation, not necessarily associative. To compute the value of

φ (x_1, φ (x_2, ... φ (x_n, a) ...))

and all of the intermediate results we rewrite the expression into a form suitable
for a prefix computation

(curry φ x_1 · curry φ x_2 · ... · curry φ x_n) a

The underlying binary operation is then simply function composition. An implementation
in hardware additionally requires that the elements of A -> A can
be finitely represented (see Section 2.1).
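The currying idea can be tried out directly in Haskell. In the following sketch the scan runs from the right, matching the nesting of the expression as printed; the function names direct and viaComposition and the sample operation phi are illustrative choices, not part of the paper.

```haskell
-- The nested expression phi (x1, phi (x2, ... phi (xn, a) ...)) together
-- with all of its intermediate results, computed directly.
direct :: ((a, a) -> a) -> [a] -> a -> [a]
direct phi xs a = scanr (curry phi) a xs

-- The same results via a scan whose binary operation is function
-- composition, which is associative even when phi is not.
viaComposition :: ((a, a) -> a) -> [a] -> a -> [a]
viaComposition phi xs a = map ($ a) (scanr (.) id (map (curry phi) xs))

-- A deliberately non-associative sample operation.
phi :: (Int, Int) -> Int
phi (x, y) = x - 2 * y
```

For instance, both direct phi [1, 2, 3] 10 and viaComposition phi [1, 2, 3] 10 yield [-71, 36, -17, 10]. Only the second form exposes an associative operation, which is what a parallel prefix circuit requires.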

Fich later proved that the Ladner-Fischer family of scans is depth-optimal 
[10]. Furthermore, he improved the design for an extra depth of one. Since then 
various other families have been proposed taking into account restrictions on 
depth and, in particular, on fan-out. Lin and Hsiao, for instance, describe a 
family of size-optimal scans with a fan-out of 4 and a small depth. One main 
ingredient is the circuit wl introduced in Section 6.1. The construction is given 
as an algorithm that transforms an explicit graph representing wl_n into a graph
representing wl_{n+1}. The transformation essentially implements the rule

(l_1 _r u_1) \ (l_2 _r u_2) = ([l_1, l_2] >- scan_2) _r (fan_2 -< [u_1, u_2])

However, since the graph representation is too concrete, the algorithm is hard 
to understand and even harder to prove correct. 

There are a few papers that deal with the derivation of parallel prefix circuits.
Misra [12] calculates the Brent-Kung circuit via the data structure of powerlists
(see footnote 1). Since powerlists capture the recursive decomposition of Brent-Kung, the
approach, while elegant, is not easily applicable to other implementations of scans.
In a recent pearl, O’Donnell and Rünger [13] derive the recursive implementation
using the digital circuit description language Hydra. The resulting specification 
contains all the necessary information to simulate or fabricate a circuit. 

The parallel prefix computation also serves as a building block of parallel
programming. We have already noted in the introduction that many algorithms
can be conveniently expressed in terms of scans [3]. Besides encouraging well-structured
programming, this coarse-grained approach to parallelism allows for
various program optimizations. Gorlatch and Lengauer [14], for instance, show
that a composition of two scans can be transformed into a single scan. The scan
function itself is an instance of a so-called list homomorphism. For this class of
functions, parallel programs can be derived in a systematic manner [15]. Applying
the approach of [15] to scan yields the optimal hypercube algorithm. This
algorithm can be seen as a clocked circuit. Consequently, there is no direct correspondence
to any of the algorithms given here, which are purely combinatorial.

1 Misra actually claims to derive the Ladner-Fischer scheme. However, the function
presented in the paper implements Brent-Kung - recall in this respect that If 00
specializes to bk.



8 Conclusion 

This paper shows that parallel prefix circuits enjoy a surprisingly rich algebra. 
The algebraic approach has several benefits: it allows us to specify scans in a 
readable and concise way, to prove them correct, and to derive new designs. In 
the process of preparing the paper the algebra of scans has undergone several 
redesigns. We hope that the final version presented here will stand the test of 
time.



Abstract. Exceptions are an important feature of modern programming
languages, but their compilation has traditionally been viewed as an ad- 
vanced topic. In this article we show that the basic method of compiling 
exceptions using stack unwinding can be explained and verified both sim- 
ply and precisely, using elementary functional programming techniques. 
In particular, we develop a compiler for a small language with exceptions, 
together with a proof of its correctness. 



1 Introduction 

Most modern programming languages support some form of programming with 
exceptions, typically based upon a primitive that abandons the current computation
and throws an exception, together with a primitive that catches an
exception and continues with another computation [3,16,11,8]. In this article 
we consider the problem of compiling such exception primitives. 

Exceptions have traditionally been viewed as an advanced topic in compi- 
lation, usually being discussed only briefly in courses, textbooks, and research 
articles, and in many cases not at all. In this article, we show that the basic 
method of compiling exceptions using stack unwinding can in fact be explained 
and verified both simply and precisely, using elementary functional programming 
techniques. In particular, we develop a compiler for a small language with excep- 
tions, together with a proof of its correctness with respect to a formal semantics 
for this language. Surprisingly, this appears to be the first time that a compiler 
for exceptions has been proved to be correct. 

In order to focus on the essence of the problem and avoid getting bogged 
down in other details, we adopt a particularly simple language comprising just 
four components, namely integer values, an addition operator, a single excep- 
tional value called throw, and a catch operator for this value. This language 
does not provide features that are necessary for actual programming, but it does 
provide just what we need for our expository purposes in this article. In partic- 
ular, integers and addition constitute a minimal language in which to consider 
computation using a stack, and throw and catch constitute a minimal extension 
in which such computations can involve exceptions. 

Our development proceeds in three steps, starting with the language of in- 
teger values and addition, then adding throw and catch to this language, and 
finally adding explicit jumps to the virtual machine. Starting with a simpler 
language allows us to introduce our approach to compilation and its correctness
without the extra complexity of exceptions. In turn, deferring the introduction
of jumps allows us to introduce our approach to the compilation of exceptions 
without the extra complexity of dealing with jump addresses. 

All the programs in the article are written in Haskell [11], but we only use the 
basic concepts of recursive types, recursive functions, and inductive proofs, what 
might be termed the “holy trinity” of functional programming. An extended 
version of the article that includes the proofs omitted in this conference version 
for reasons of space is available from the authors’ home pages. 



2 Arithmetic Expressions 



Let us begin by considering a simple language of arithmetic expressions, built 
up from integers using an addition operator. In Haskell, the language of such 
expressions can be represented by the following type: 

data Expr = Val Int | Add Expr Expr

The semantics of such expressions is most naturally given denotationally [14], 
by defining a function that evaluates an expression to its integer value: 

eval           :: Expr -> Int
eval (Val n)    = n
eval (Add x y)  = eval x + eval y

Now let us consider compiling arithmetic expressions into code for execution 
using a stack machine, in which the stack is represented as a list of integers, and 
code comprises a list of push and add operations on the stack: 

type Stack = [Int]
type Code  = [Op]
data Op    = PUSH Int | ADD

For ease of identification, we always use upper-case names for machine opera- 
tions. Functions that compile an expression into code, and execute code using 
an initial stack to give a final stack, can now be defined as follows: 



comp                         :: Expr -> Code
comp (Val n)                  = [PUSH n]
comp (Add x y)                = comp x ++ comp y ++ [ADD]

exec                         :: Stack -> Code -> Stack
exec s []                     = s
exec s (PUSH n : ops)         = exec (n : s) ops
exec (m : n : s) (ADD : ops)  = exec (n + m : s) ops



For simplicity, the function exec does not consider the case of stack underflow,
which arises if the stack contains fewer than two integers when executing an add
operation. We will return to this issue in the next section.
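The definitions of this section fit together into a small runnable module. The sketch below repeats them with derived Show and Eq instances so that results can be inspected and compared; the sample expression named example is an illustrative addition.

```haskell
-- Source language: integers and addition.
data Expr = Val Int | Add Expr Expr
  deriving (Show, Eq)

-- Denotational semantics.
eval :: Expr -> Int
eval (Val n)   = n
eval (Add x y) = eval x + eval y

-- Stack machine.
type Stack = [Int]
type Code  = [Op]
data Op    = PUSH Int | ADD
  deriving (Show, Eq)

comp :: Expr -> Code
comp (Val n)   = [PUSH n]
comp (Add x y) = comp x ++ comp y ++ [ADD]

-- As in the text, stack underflow is not handled: ADD on a stack with
-- fewer than two integers is a pattern-match failure.
exec :: Stack -> Code -> Stack
exec s           []             = s
exec s           (PUSH n : ops) = exec (n : s) ops
exec (m : n : s) (ADD : ops)    = exec (n + m : s) ops

-- Hypothetical sample input: 2 + (3 + 4).
example :: Expr
example = Add (Val 2) (Add (Val 3) (Val 4))
```

Here comp example gives [PUSH 2, PUSH 3, PUSH 4, ADD, ADD], and executing this code on the empty stack leaves the single value 9, which is also the result of eval example.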




Compiling Exceptions Correctly 213 



3 Compiler Correctness 

The correctness of our compiler for expressions with respect to our semantics 
can be expressed as the commutativity of the following diagram: 



            eval
    Expr ----------> Int
     |                |
     | comp           | [·]
     v                v
    Code ----------> Stack
           exec []

That is, compiling an expression and then executing the resulting code using an 
empty initial stack gives the same final stack as evaluating the expression and 
then converting the resulting integer into a singleton stack: 

exec [] (comp e) = [eval e]

In order to prove this result, however, it is necessary to generalise from the empty 
initial stack to an arbitrary initial stack. 

Theorem 1 (compiler correctness).

exec s (comp e) = eval e : s

Proof. By induction on e :: Expr.

Case: e = Val n

    exec s (comp (Val n))
= { definition of comp }
    exec s [PUSH n]
= { definition of exec }
    n : s
= { definition of eval }
    eval (Val n) : s

Case: e = Add x y

    exec s (comp (Add x y))
= { definition of comp }
    exec s (comp x ++ comp y ++ [ADD])
= { execution distributivity (lemma 1) }
    exec (exec s (comp x)) (comp y ++ [ADD])
= { induction hypothesis }
    exec (eval x : s) (comp y ++ [ADD])
= { execution distributivity }
    exec (exec (eval x : s) (comp y)) [ADD]
= { induction hypothesis }
    exec (eval y : eval x : s) [ADD]
= { definition of exec }
    eval x + eval y : s
= { definition of eval }
    eval (Add x y) : s

□ 

Note that without first generalising the result, the second induction hypothesis 
step above would be invalid. The distribution lemma is as follows. 

Lemma 1 (execution distributivity). 

exec s (xs ++ ys) = exec (exec s xs) ys

That is, executing two pieces of code appended together gives the same result 
as executing the two pieces of code separately in sequence. 

Proof. By induction on xs :: Code. 

When performing an addition in this proof, the stack not containing at least two
integers corresponds to a stack underflow error. In this case, the equation to be
proved is trivially true, because the result of both sides is undefined (⊥), provided
that we assume that exec is strict in its stack argument (exec ⊥ ops = ⊥).
This extra strictness assumption could be avoided by representing and managing
stack underflow explicitly, rather than implicitly using ⊥. In fact, however, both
lemma 1 and its consequent underflow issue can be avoided altogether by further
generalising our correctness theorem to also consider additional code.

Theorem 2 (generalised compiler correctness).

exec s (comp e ++ ops) = exec (eval e : s) ops

That is, compiling an expression and then executing the resulting code appended
together with arbitrary additional code gives the same result as pushing the value
of the expression to give a new stack, which is then used to execute the additional
code. Note that with s = ops = [], theorem 2 simplifies to exec [] (comp e) =
[eval e], our original statement of compiler correctness.

Proof. By induction on e :: Expr.

Case: e = Val n

    exec s (comp (Val n) ++ ops)
= { definition of comp }
    exec s ([PUSH n] ++ ops)
= { definition of exec }
    exec (n : s) ops
= { definition of eval }
    exec (eval (Val n) : s) ops

Case: e = Add x y

    exec s (comp (Add x y) ++ ops)
= { definition of comp }
    exec s (comp x ++ comp y ++ [ADD] ++ ops)
= { induction hypothesis }
    exec (eval x : s) (comp y ++ [ADD] ++ ops)
= { induction hypothesis }
    exec (eval y : eval x : s) ([ADD] ++ ops)
= { definition of exec }
    exec (eval x + eval y : s) ops
= { definition of eval }
    exec (eval (Add x y) : s) ops

□ 

In addition to avoiding the problem of stack underflow, the above proof also has 
the important benefit of being approximately one third of the combined length 
of our previous two proofs. As is often the case in mathematics, generalising a 
theorem in the appropriate manner can considerably simplify its proof. 
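Although the proof settles the matter, theorem 2 can also be tested on small inputs as a guard against transcription errors. The following sketch enumerates expressions up to a fixed nesting depth and checks the equation for a handful of initial stacks and code continuations; the names exprs, theorem2 and checkTheorem2, and the particular test data, are illustrative assumptions.

```haskell
data Expr = Val Int | Add Expr Expr
  deriving Show

type Stack = [Int]
type Code  = [Op]
data Op    = PUSH Int | ADD
  deriving Show

eval :: Expr -> Int
eval (Val n)   = n
eval (Add x y) = eval x + eval y

comp :: Expr -> Code
comp (Val n)   = [PUSH n]
comp (Add x y) = comp x ++ comp y ++ [ADD]

exec :: Stack -> Code -> Stack
exec s           []             = s
exec s           (PUSH n : ops) = exec (n : s) ops
exec (m : n : s) (ADD : ops)    = exec (n + m : s) ops

-- All expressions of nesting depth at most d (d >= 0) over two leaf values.
exprs :: Int -> [Expr]
exprs 0 = [Val 0, Val 1]
exprs d = smaller ++ [Add x y | x <- smaller, y <- smaller]
  where smaller = exprs (d - 1)

-- One instance of theorem 2.
theorem2 :: Stack -> Expr -> Code -> Bool
theorem2 s e ops = exec s (comp e ++ ops) == exec (eval e : s) ops

-- The continuations are chosen so that no stack underflow can occur.
checkTheorem2 :: Bool
checkTheorem2 = and [theorem2 s e ops | e <- exprs 2, s <- stacks, ops <- codes]
  where
    stacks = [[], [7], [7, 8]]
    codes  = [[], [PUSH 1], [PUSH 1, ADD]]
```

Such a check is no substitute for the inductive proof, but it exercises the same equation on concrete data.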



4 Adding Exceptions 



Now let us extend our language of arithmetic expressions with simple primitives 
for throwing and catching an exception: 

data Expr = ... | Throw | Catch Expr Expr

Informally, Throw abandons the current computation and throws an exception, 
while Catch x h behaves as the expression x unless it throws an exception, 
in which case the catch behaves as the handler expression h. To formalise the 
meaning of these new primitives, we first recall the Maybe type: 

data Maybe a = Nothing | Just a

That is, a value of type Maybe a is either Nothing, which we think of as an
exceptional value, or has the form Just x for some x of type a, which we think of
as a normal value [15]. Using this type, our denotational semantics for expressions
can now be rewritten to take account of exceptions as follows:



eval              :: Expr -> Maybe Int
eval (Val n)       = Just n
eval (Add x y)     = case eval x of
                       Nothing -> Nothing
                       Just n  -> case eval y of
                                    Nothing -> Nothing
                                    Just m  -> Just (n + m)
eval Throw         = Nothing
eval (Catch x h)   = case eval x of
                       Nothing -> eval h
                       Just n  -> Just n

Note that addition propagates an exception thrown in either argument. By exploiting
the fact that Maybe forms a monad [17], the above definition can be
expressed more abstractly and concisely using monadic syntax [12]:

eval              :: Expr -> Maybe Int
eval (Val n)       = return n
eval (Add x y)     = do n <- eval x
                        m <- eval y
                        return (n + m)
eval Throw         = mzero
eval (Catch x h)   = eval x `mplus` eval h

For the purposes of proofs, however, we use our non-monadic definition for eval. 
To illustrate our new semantics, here are a few simple examples: 



eval (Add (Val 2) (Val 3))      = Just 5      -- no exceptions
eval (Add Throw (Val 3))        = Nothing     -- uncaught exception
eval (Catch (Val 2) (Val 3))    = Just 2      -- unused handler
eval (Catch Throw (Val 3))      = Just 3      -- caught exception
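The two formulations of the semantics can be run side by side on these four examples. In the sketch below the monadic version is renamed evalM, a name chosen here only to allow both definitions to coexist in one module; mzero and mplus come from Control.Monad.

```haskell
import Control.Monad (mplus, mzero)

data Expr = Val Int | Add Expr Expr | Throw | Catch Expr Expr
  deriving Show

-- Explicit case analysis, as used for the proofs.
eval :: Expr -> Maybe Int
eval (Val n)     = Just n
eval (Add x y)   =
  case eval x of
    Nothing -> Nothing
    Just n  ->
      case eval y of
        Nothing -> Nothing
        Just m  -> Just (n + m)
eval Throw       = Nothing
eval (Catch x h) =
  case eval x of
    Nothing -> eval h
    Just n  -> Just n

-- Monadic formulation using the Maybe monad.
evalM :: Expr -> Maybe Int
evalM (Val n)     = return n
evalM (Add x y)   = do
  n <- evalM x
  m <- evalM y
  return (n + m)
evalM Throw       = mzero
evalM (Catch x h) = evalM x `mplus` evalM h

-- The four example expressions from the text.
examples :: [Expr]
examples =
  [ Add (Val 2) (Val 3)
  , Add Throw (Val 3)
  , Catch (Val 2) (Val 3)
  , Catch Throw (Val 3)
  ]
```

Mapping either function over examples gives [Just 5, Nothing, Just 2, Just 3].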



Now let us consider how the exception primitives can be compiled. First of 
all, we introduce three new machine operations: 

data Op = ... | THROW | MARK Code | UNMARK

Informally, THROW throws an exception, MARK pushes a piece of code onto 
the stack, while UNMARK pops such code from the stack. Using these opera- 
tions, our compiler for expressions can now be extended as follows: 

comp Throw        = [THROW]
comp (Catch x h)  = [MARK (comp h)] ++ comp x ++ [UNMARK]

That is, Throw is compiled directly to the corresponding machine operation, 
while Catch x h is compiled by first marking the stack with the compiled code 
for the handler h, then compiling the expression to be evaluated x, and finally 
unmarking the stack by removing the handler. In this way, the mark and unmark 
operations delimit the scope of the handler h to the expression x, in the sense that
the handler is only present on the stack during execution of the expression. Note 
that the stack is marked with the actual compiled code for the handler, rather 
than some form of pointer to the code as would be used in a real implementation. 
We will return to this issue later on in the article. 

Because the stack can now contain handler code as well as integer values, the 
type for stacks must itself be rewritten: 

type Stack = [Item]
data Item  = VAL Int | HAN Code







In turn, our function that executes code is now rewritten as follows: 



exec                     :: Stack -> Code -> Stack
exec s []                 = s
exec s (PUSH n : ops)     = exec (VAL n : s) ops
exec s (ADD : ops)        = case s of
                              (VAL m : VAL n : s') -> exec (VAL (n + m) : s') ops
exec s (THROW : ops)      = unwind s (skip ops)
exec s (MARK ops' : ops)  = exec (HAN ops' : s) ops
exec s (UNMARK : ops)     = case s of
                              (x : HAN _ : s') -> exec (x : s') ops

That is, push and add are executed as previously, except that we must now take 
account of the fact that values on the stack are tagged. For execution of a throw, 
there are a number of issues to consider. First of all, the current computation 
needs to be abandoned, which means removing any intermediate values that 
have been pushed onto the stack by this computation, as well as skipping any 
remaining code for this computation. And secondly, the current handler code 
needs to be executed, if there is any, followed by any code that remains after 
the abandoned computation. The function exec implements these ideas using 
an auxiliary function unwind that pops items from the stack until a handler is 
found, at which point the handler is executed followed by the remaining code, 
which is itself produced using a function skip that skips to the next unmark: 



unwind                     :: Stack -> Code -> Stack
unwind [] _                 = []
unwind (VAL _ : s) ops      = unwind s ops
unwind (HAN ops' : s) ops   = exec s (ops' ++ ops)

skip                       :: Code -> Code
skip []                     = []
skip (UNMARK : ops)         = ops
skip (MARK _ : ops)         = skip (skip ops)
skip (_ : ops)              = skip ops



Note that unwind has the same type as exec , and can be viewed as an alternative 
mode of this function for the case when the virtual machine is in the process 
of handling an exception. For simplicity, unwind returns the empty stack in the 
case of an uncaught exception. For a language in which the empty stack was 
a valid result, a separate representation for an uncaught exception would be 
required. Note also the double recursion when skipping a mark, which reflects 
the fact that there may be nested mark/unmark pairs in the remaining code. 

Returning to the remaining cases in the definition of exec above, a mark is 
executed simply by pushing the given handler code onto the stack, and dually, 
an unmark by popping this code from the stack. Between executing a mark and 
its corresponding unmark, however, the code delimited by these two operations 
will have pushed its result value onto the stack, and hence when the handler 
code is popped it will actually be the second-top item.

To illustrate our new compiler and virtual machine, their behaviour on the
four example expressions from earlier in this section is shown below, in which
the symbol $$ denotes the result of the last compilation:

comp (Add (Val 2) (Val 3))      = [PUSH 2, PUSH 3, ADD]
exec [] $$                      = [VAL 5]

comp (Add Throw (Val 3))        = [THROW, PUSH 3, ADD]
exec [] $$                      = []

comp (Catch (Val 2) (Val 3))    = [MARK [PUSH 3], PUSH 2, UNMARK]
exec [] $$                      = [VAL 2]

comp (Catch Throw (Val 3))      = [MARK [PUSH 3], THROW, UNMARK]
exec [] $$                      = [VAL 3]
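For readers who wish to reproduce these results, the compiler and virtual machine of this section are assembled below into one module. The only departures from the text are the derived Show and Eq instances and the explicit error cases for malformed stacks, which the paper leaves as pattern-match failures; both are labelled in the comments.

```haskell
data Expr = Val Int | Add Expr Expr | Throw | Catch Expr Expr
  deriving (Show, Eq)

data Op = PUSH Int | ADD | THROW | MARK Code | UNMARK
  deriving (Show, Eq)
type Code = [Op]

data Item = VAL Int | HAN Code
  deriving (Show, Eq)
type Stack = [Item]

comp :: Expr -> Code
comp (Val n)     = [PUSH n]
comp (Add x y)   = comp x ++ comp y ++ [ADD]
comp Throw       = [THROW]
comp (Catch x h) = [MARK (comp h)] ++ comp x ++ [UNMARK]

exec :: Stack -> Code -> Stack
exec s []                = s
exec s (PUSH n : ops)    = exec (VAL n : s) ops
exec s (ADD : ops)       =
  case s of
    (VAL m : VAL n : s') -> exec (VAL (n + m) : s') ops
    _                    -> error "ADD: malformed stack"     -- added here
exec s (THROW : ops)     = unwind s (skip ops)
exec s (MARK ops' : ops) = exec (HAN ops' : s) ops
exec s (UNMARK : ops)    =
  case s of
    (x : HAN _ : s') -> exec (x : s') ops
    _                -> error "UNMARK: malformed stack"      -- added here

-- Pop items until a handler is found, then run it before the rest.
unwind :: Stack -> Code -> Stack
unwind []             _   = []
unwind (VAL _ : s)    ops = unwind s ops
unwind (HAN ops' : s) ops = exec s (ops' ++ ops)

-- Skip to just after the matching UNMARK, respecting nesting.
skip :: Code -> Code
skip []             = []
skip (UNMARK : ops) = ops
skip (MARK _ : ops) = skip (skip ops)
skip (_ : ops)      = skip ops
```

A nested case not shown in the text, exec [] (comp (Add (Val 1) (Catch (Add (Val 2) Throw) (Val 10)))), yields [VAL 11]: the partial result 2 is discarded during unwinding, while the 1 pushed before the mark survives.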



5 Compiler Correctness 



Generalising from the examples in the previous section, the correctness of our 
new compiler is expressed by the commutativity of the following diagram: 



            eval
    Expr ----------> Maybe Int
     |                |
     | comp           | conv
     v                v
    Code ----------> Stack
           exec []



That is, compiling an expression and then executing the resulting code using an 
empty initial stack gives the same final stack as evaluating the expression and 
then converting the resulting semantic value into the corresponding stack, using 
an auxiliary function conv that is defined as follows: 

conv           :: Maybe Int -> Stack
conv Nothing    = []
conv (Just n)   = [VAL n]

As previously, however, in order to prove this result we generalise to an arbitrary 
initial stack and also consider additional code, and in turn rewrite the function 
conv to take account of these two extra arguments. 

Theorem 3 (compiler correctness). 

exec s (comp e ++ ops) = conv s (eval e) ops

where

conv                   :: Stack -> Maybe Int -> Code -> Stack
conv s Nothing ops      = unwind s (skip ops)
conv s (Just n) ops     = exec (VAL n : s) ops

Note that with s = ops = [], this theorem simplifies to our original statement
of correctness above. The right-hand side of theorem 3 could also be written
as exec s (conv (eval e) : ops) using a simpler version of conv with type
Maybe Int -> Op, but the above formulation leads to simpler proofs.

Proof. By induction on e :: Expr. 

Case: e = Val n

    exec s (comp (Val n) ++ ops)
= { definition of comp }
    exec s ([PUSH n] ++ ops)
= { definition of exec }
    exec (VAL n : s) ops
= { definition of conv }
    conv s (Just n) ops
= { definition of eval }
    conv s (eval (Val n)) ops

Case: e = Throw

    exec s (comp Throw ++ ops)
= { definition of comp }
    exec s ([THROW] ++ ops)
= { definition of exec }
    unwind s (skip ops)
= { definition of conv }
    conv s Nothing ops
= { definition of eval }
    conv s (eval Throw) ops

Case: e = Add x y

    exec s (comp (Add x y) ++ ops)
= { definition of comp }
    exec s (comp x ++ comp y ++ [ADD] ++ ops)
= { induction hypothesis }
    conv s (eval x) (comp y ++ [ADD] ++ ops)
= { definition of conv }
    case eval x of
      Nothing -> unwind s (skip (comp y ++ [ADD] ++ ops))
      Just n  -> exec (VAL n : s) (comp y ++ [ADD] ++ ops)

The two possible results from this expression are simplified below.

    unwind s (skip (comp y ++ [ADD] ++ ops))
= { skipping compiled code (lemma 2) }
    unwind s (skip ([ADD] ++ ops))
= { definition of skip }
    unwind s (skip ops)

    exec (VAL n : s) (comp y ++ [ADD] ++ ops)
= { induction hypothesis }
    conv (VAL n : s) (eval y) ([ADD] ++ ops)
= { definition of conv }
    case eval y of
      Nothing -> unwind (VAL n : s) (skip ([ADD] ++ ops))
      Just m  -> exec (VAL m : VAL n : s) ([ADD] ++ ops)
= { definition of unwind, skip and exec }
    case eval y of
      Nothing -> unwind s (skip ops)
      Just m  -> exec (VAL (n + m) : s) ops

We now continue the calculation using the two simplified results.

    case eval x of
      Nothing -> unwind s (skip ops)
      Just n  -> case eval y of
                   Nothing -> unwind s (skip ops)
                   Just m  -> exec (VAL (n + m) : s) ops
= { definition of conv }
    case eval x of
      Nothing -> conv s Nothing ops
      Just n  -> case eval y of
                   Nothing -> conv s Nothing ops
                   Just m  -> conv s (Just (n + m)) ops
= { distribution over case }
    conv s (case eval x of
              Nothing -> Nothing
              Just n  -> case eval y of
                           Nothing -> Nothing
                           Just m  -> Just (n + m)) ops
= { definition of eval }
    conv s (eval (Add x y)) ops

Case: e = Catch x h

    exec s (comp (Catch x h) ++ ops)
= { definition of comp }
    exec s ([MARK (comp h)] ++ comp x ++ [UNMARK] ++ ops)
= { definition of exec }
    exec (HAN (comp h) : s) (comp x ++ [UNMARK] ++ ops)
= { induction hypothesis }
    conv (HAN (comp h) : s) (eval x) ([UNMARK] ++ ops)
= { definition of conv }
    case eval x of
      Nothing -> unwind (HAN (comp h) : s) (skip ([UNMARK] ++ ops))
      Just n  -> exec (VAL n : HAN (comp h) : s) ([UNMARK] ++ ops)
= { definition of unwind, skip and exec }
    case eval x of
      Nothing -> exec s (comp h ++ ops)
      Just n  -> exec (VAL n : s) ops
= { induction hypothesis }
    case eval x of
      Nothing -> conv s (eval h) ops
      Just n  -> exec (VAL n : s) ops
= { definition of conv }
    case eval x of
      Nothing -> conv s (eval h) ops
      Just n  -> conv s (Just n) ops
= { distribution over case }
    conv s (case eval x of
              Nothing -> eval h
              Just n  -> Just n) ops
= { definition of eval }
    conv s (eval (Catch x h)) ops

□ 

The two distribution over case steps in the above proof rely on the fact that
conv is strict in its semantic value argument (conv s ⊥ ops = ⊥), which is
indeed the case because conv is defined by pattern matching on this argument.
The skipping lemma used in the above proof is as follows.

Lemma 2 (skipping compiled code).

skip (comp e ++ ops) = skip ops

That is, skipping to the next unmark in compiled code followed by arbitrary 
additional code gives the same result as simply skipping the additional code. 
Intuitively, this is the case because the compiler ensures that all unmarks in 
compiled code are matched by preceding marks. 
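Both theorem 3 and lemma 2 lend themselves to a quick mechanical check on small inputs. The sketch below restates the definitions of the previous section and tests the two equations over all expressions up to a fixed nesting depth; the enumeration exprs, the property names theorem3 and lemma2, and the chosen stacks and continuations are illustrative assumptions rather than part of the paper.

```haskell
data Expr = Val Int | Add Expr Expr | Throw | Catch Expr Expr
  deriving (Show, Eq)
data Op = PUSH Int | ADD | THROW | MARK Code | UNMARK
  deriving (Show, Eq)
type Code = [Op]
data Item = VAL Int | HAN Code
  deriving (Show, Eq)
type Stack = [Item]

eval :: Expr -> Maybe Int
eval (Val n)     = Just n
eval (Add x y)   = (+) <$> eval x <*> eval y
eval Throw       = Nothing
eval (Catch x h) = maybe (eval h) Just (eval x)

comp :: Expr -> Code
comp (Val n)     = [PUSH n]
comp (Add x y)   = comp x ++ comp y ++ [ADD]
comp Throw       = [THROW]
comp (Catch x h) = [MARK (comp h)] ++ comp x ++ [UNMARK]

exec :: Stack -> Code -> Stack
exec s []                = s
exec s (PUSH n : ops)    = exec (VAL n : s) ops
exec s (ADD : ops)       = case s of
  (VAL m : VAL n : s') -> exec (VAL (n + m) : s') ops
  _                    -> error "ADD: malformed stack"
exec s (THROW : ops)     = unwind s (skip ops)
exec s (MARK ops' : ops) = exec (HAN ops' : s) ops
exec s (UNMARK : ops)    = case s of
  (x : HAN _ : s') -> exec (x : s') ops
  _                -> error "UNMARK: malformed stack"

unwind :: Stack -> Code -> Stack
unwind []             _   = []
unwind (VAL _ : s)    ops = unwind s ops
unwind (HAN ops' : s) ops = exec s (ops' ++ ops)

skip :: Code -> Code
skip []             = []
skip (UNMARK : ops) = ops
skip (MARK _ : ops) = skip (skip ops)
skip (_ : ops)      = skip ops

conv :: Stack -> Maybe Int -> Code -> Stack
conv s Nothing  ops = unwind s (skip ops)
conv s (Just n) ops = exec (VAL n : s) ops

-- All expressions of nesting depth at most d (d >= 0).
exprs :: Int -> [Expr]
exprs 0 = [Val 0, Val 1, Throw]
exprs d = smaller ++ [op x y | op <- [Add, Catch], x <- smaller, y <- smaller]
  where smaller = exprs (d - 1)

theorem3 :: Stack -> Expr -> Code -> Bool
theorem3 s e ops = exec s (comp e ++ ops) == conv s (eval e) ops

lemma2 :: Expr -> Code -> Bool
lemma2 e ops = skip (comp e ++ ops) == skip ops

-- Initial stacks paired with continuations that are well formed for them.
contexts :: [(Stack, Code)]
contexts = [([], []), ([VAL 7], [PUSH 1, ADD]), ([HAN [PUSH 9]], [UNMARK])]
```

The applicative and maybe-based definition of eval used here is a compact equivalent of the case-based definition in the text. The contexts are chosen so that neither side of theorem 3 runs into a malformed stack.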



Proof. By induction on e :: Expr.



6 Adding Jumps

Now let us extend our virtual machine with primitives that allow exceptions to 
be compiled using explicit jumps, rather than by pushing handler code onto the 
stack. First of all, we introduce three new machine operations: 

data Op = ... | MARK Addr | LABEL Addr | JUMP Addr

Informally, MARK pushes the address of a piece of code onto the stack (replacing 
our previous mark operator that pushed code itself), LABEL declares an address, 
and JUMP transfers control to an address. Addresses themselves are represented 
simply as integers, and we ensure that each address is declared at most once by 
generating addresses in sequential order using a function fresh: 

type Addr = Int

fresh    :: Addr -> Addr
fresh a   = a + 1

Our compiler for expressions is now extended to take the next fresh address as 
an additional argument, and is rewritten in terms of another function compile 
that also returns the next fresh address as an additional result: 

comp                   :: Addr -> Expr -> Code
comp a e                = fst (compile a e)

compile                :: Addr -> Expr -> (Code, Addr)
compile a (Val n)       = ([PUSH n], a)
compile a (Add x y)     = (xs ++ ys ++ [ADD], c)
                          where
                            (xs, b) = compile a x
                            (ys, c) = compile b y
compile a Throw         = ([THROW], a)
compile a (Catch x h)   = ([MARK a] ++ xs ++ [UNMARK, JUMP b,
                            LABEL a] ++ hs ++ [LABEL b], e)
                          where
                            b       = fresh a
                            c       = fresh b
                            (xs, d) = compile c x
                            (hs, e) = compile d h

Note that integer values, addition, and throw are compiled as previously, except
that the next fresh address is threaded through, while Catch x h is now compiled
to the following sequence, in which a: abbreviates LABEL a:

        MARK a
        compiled code for x
        UNMARK
        JUMP b
    a:  compiled code for h
    b:  rest of the code

That is, Catch x h is now compiled by first marking the stack with the address of
the compiled code for the handler h, compiling the expression to be evaluated x,
then unmarking the stack by removing the address of the handler, and finally
jumping over the handler code to the rest of the code.

By exploiting the fact that the type for compile can be expressed using a 
state monad [17], the above definition can also be expressed more abstractly 
and concisely using monadic syntax. As with the function eval , however, for the 
purposes of proofs we use our non-monadic definition for compile. 

Because the stack can now contain handler addresses rather than handler 
code, the type for stack items must be rewritten: 

data Item = VAL Int | HAN Addr

In turn, our function that executes code requires four modifications:

exec s (THROW : ops)   = unwind s ops
exec s (MARK a : ops)  = exec (HAN a : s) ops
exec s (LABEL _ : ops) = exec s ops
exec s (JUMP a : ops)  = exec s (jump a ops)



For execution of a throw, the use of explicit jumps means that the function skip 
is no longer required, and there are now only two issues to consider. First of all, 
the current computation needs to be abandoned, by removing any intermediate 
values that have been pushed onto the stack by this computation. And secondly, 
the current handler needs to be executed, if there is any. Implementing these 
ideas requires modifying one line in the definition of unwind:

unwind (HAN a : s) ops = exec s (jump a ops)

That is, once the address of a handler is found, the handler is executed using a 
function jump that transfers control to a given address: 



jump :: Addr -> Code -> Code
jump _ [] = []
jump a (LABEL b : ops) = if a == b then ops else jump a ops
jump a (_ : ops) = jump a ops

Note that our language only requires forward jumps. If backward jumps were
also possible, a slightly generalised virtual machine would be required.

Returning to the remaining modified cases in the definition of exec above, a 
mark is executed simply by pushing the given address onto the stack, a label is 
executed by skipping the label, and a jump is executed by transferring control 
to the given address using the function jump defined above. 

The behaviour of our new compiler and virtual machine on the four example 
expressions from earlier in the article is shown below: 
comp 0 (Add (Val 2) (Val 3))   = [PUSH 2, PUSH 3, ADD]
exec [] $$                     = [VAL 5]

comp 0 (Add Throw (Val 3))     = [THROW, PUSH 3, ADD]
exec [] $$                     = []

comp 0 (Catch (Val 2) (Val 3)) = [MARK 0, PUSH 2, UNMARK, JUMP 1,
                                  LABEL 0, PUSH 3, LABEL 1]
exec [] $$                     = [VAL 2]

comp 0 (Catch Throw (Val 3))   = [MARK 0, THROW, UNMARK, JUMP 1,
                                  LABEL 0, PUSH 3, LABEL 1]
exec [] $$                     = [VAL 3]



Note that our compiler now once again produces “flat” code, in contrast to our 
previous version, which produced tree-structured code. 
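
For readers who wish to experiment, the compiler and virtual machine of this section can be assembled into a single loadable module along the following lines. This is a sketch rather than the article's exact code: the declarations of Expr, Op and Stack, and the exec cases for the empty code, PUSH, ADD and UNMARK, together with the first two cases of unwind, are assumed to be as defined earlier in the article, and the final catch-all case of exec is an addition that merely reports ill-formed input.

```haskell
data Expr = Val Int | Add Expr Expr | Throw | Catch Expr Expr

type Addr = Int

data Op = PUSH Int | ADD | THROW | MARK Addr | UNMARK | JUMP Addr | LABEL Addr
  deriving (Eq, Show)

type Code = [Op]

data Item = VAL Int | HAN Addr
  deriving (Eq, Show)

type Stack = [Item]

fresh :: Addr -> Addr
fresh a = a + 1

comp :: Addr -> Expr -> Code
comp a e = fst (compile a e)

-- Compiler that threads the next fresh address through the translation.
compile :: Addr -> Expr -> (Code, Addr)
compile a (Val n)     = ([PUSH n], a)
compile a (Add x y)   = (xs ++ ys ++ [ADD], c)
  where
    (xs, b) = compile a x
    (ys, c) = compile b y
compile a Throw       = ([THROW], a)
compile a (Catch x h) =
  ([MARK a] ++ xs ++ [UNMARK, JUMP b, LABEL a] ++ hs ++ [LABEL b], e)
  where
    b       = fresh a
    c       = fresh b
    (xs, d) = compile c x
    (hs, e) = compile d h

-- Virtual machine. The [], PUSH, ADD and UNMARK cases are assumed from
-- earlier in the article; the last case is an added guard for bad input.
exec :: Stack -> Code -> Stack
exec s []                            = s
exec s (PUSH n : ops)                = exec (VAL n : s) ops
exec (VAL m : VAL n : s) (ADD : ops) = exec (VAL (n + m) : s) ops
exec s (THROW : ops)                 = unwind s ops
exec s (MARK a : ops)                = exec (HAN a : s) ops
exec (x : HAN _ : s) (UNMARK : ops)  = exec (x : s) ops
exec s (LABEL _ : ops)               = exec s ops
exec s (JUMP a : ops)                = exec s (jump a ops)
exec _ _                             = error "exec: ill-formed code or stack"

-- Discard intermediate values until a handler address is found.
unwind :: Stack -> Code -> Stack
unwind [] _            = []
unwind (VAL _ : s) ops = unwind s ops
unwind (HAN a : s) ops = exec s (jump a ops)

-- Transfer control forwards to the code following LABEL a.
jump :: Addr -> Code -> Code
jump _ []              = []
jump a (LABEL b : ops) = if a == b then ops else jump a ops
jump a (_ : ops)       = jump a ops
```

Loading this module in an interpreter and evaluating, for instance, exec [] (comp 0 (Catch Throw (Val 3))) should reproduce the corresponding result listed above.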



7 Compiler Correctness 

The correctness of our new compiler is expressed by the same diagram as the 
previous version, except that the new compiler takes the next fresh address as an
additional argument, for which we supply zero as the initial value: 



              eval
    Expr -------------> Maybe Int
      |                     |
      | comp 0              | conv
      v                     v
    Code -------------> Stack
             exec []



For the purposes of proofs we once again generalise this result to an arbitrary 
initial stack and additional code, and extend the function conv accordingly. In 
turn, we also generalise to an arbitrary initial address that is fresh with respect 
to the initial stack, using a predicate isFresh that decides if a given address is 
greater than every address that occurs in a stack. 

Theorem 4 (compiler correctness). 



If isFresh a s then exec s (comp a e ++ ops) = conv s (eval e) ops

where

isFresh :: Addr -> Stack -> Bool
isFresh _ [] = True
isFresh a (VAL _ : s) = isFresh a s
isFresh a (HAN b : s) = a > b ∧ isFresh a s

conv :: Stack -> Maybe Int -> Code -> Stack
conv s Nothing ops = unwind s ops
conv s (Just n) ops = exec (VAL n : s) ops

Proof. By induction on e :: Expr in a similar manner to theorem 3, except that
five lemmas concerning fresh addresses are required: 

Lemma 3 (unwinding operators). 

If op = LABEL a ⇒ isFresh a s then

unwind s (op : ops) = unwind s ops

That is, when unwinding the stack the first operator in the code can be discarded, 
provided that it is not an address that may occur in the stack. 

Proof. By induction on s :: Stack. 

Lemma 4 (unwinding compiled code). 

If isFresh a s then unwind s (comp a e ++ ops) = unwind s ops

That is, unwinding the stack on compiled code followed by arbitrary additional 
code gives the same result as simply unwinding the stack on the additional code, 
provided that the initial address for the compiler is fresh for the stack. 

Proof. By induction on e :: Expr, using lemma 3 above. 

Lemma 5 ( isFresh is monotonic). 

If a ≤ b ∧ isFresh a s then isFresh b s

That is, if one address is at most another, and the first is fresh with respect to 
a stack, then the second is also fresh with respect to this stack. 

Proof. By induction on s :: Stack. 

Lemma 6 ( compile is non-decreasing). 

snd (compile a e) ≥ a

That is, the next address returned by the compiler will always be greater than 
or equal to the address supplied as an argument. 

Proof. By induction on e :: Expr. 

Lemma 7 (jumping compiled code). 

If a < b then jump a (comp b e ++ ops) = jump a ops

That is, jumping to an address in compiled code followed by arbitrary additional 
code gives the same result as simply jumping in the additional code, provided 
that the jump address is less than the initial address for the compiler. 

Proof. By induction on e :: Expr, using lemma 6 above. 
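
Although it is no substitute for the proofs, theorem 4 and lemma 6 can also be tested on small cases. The sketch below restates the definitions of this section, assumes eval and the remaining cases of exec and unwind to be as given earlier in the article, and checks both properties for all expressions up to a small depth. The names theorem4, lemma6, exprs and checkAll, and the particular stacks and code fragments tried, are hypothetical choices made for illustration.

```haskell
data Expr  = Val Int | Add Expr Expr | Throw | Catch Expr Expr
type Addr  = Int
data Op    = PUSH Int | ADD | THROW | MARK Addr | UNMARK | JUMP Addr | LABEL Addr
  deriving (Eq, Show)
type Code  = [Op]
data Item  = VAL Int | HAN Addr deriving (Eq, Show)
type Stack = [Item]

-- Semantics, assumed from earlier in the article (not shown in this section).
eval :: Expr -> Maybe Int
eval (Val n)     = Just n
eval (Add x y)   = (+) <$> eval x <*> eval y
eval Throw       = Nothing
eval (Catch x h) = maybe (eval h) Just (eval x)

compile :: Addr -> Expr -> (Code, Addr)
compile a (Val n)     = ([PUSH n], a)
compile a (Add x y)   = (xs ++ ys ++ [ADD], c)
  where (xs, b) = compile a x
        (ys, c) = compile b y
compile a Throw       = ([THROW], a)
compile a (Catch x h) =
  ([MARK a] ++ xs ++ [UNMARK, JUMP b, LABEL a] ++ hs ++ [LABEL b], e)
  where b       = a + 1
        c       = b + 1
        (xs, d) = compile c x
        (hs, e) = compile d h

comp :: Addr -> Expr -> Code
comp a e = fst (compile a e)

-- Virtual machine; the [], PUSH, ADD and UNMARK cases are assumed.
exec :: Stack -> Code -> Stack
exec s []                            = s
exec s (PUSH n : ops)                = exec (VAL n : s) ops
exec (VAL m : VAL n : s) (ADD : ops) = exec (VAL (n + m) : s) ops
exec s (THROW : ops)                 = unwind s ops
exec s (MARK a : ops)                = exec (HAN a : s) ops
exec (x : HAN _ : s) (UNMARK : ops)  = exec (x : s) ops
exec s (LABEL _ : ops)               = exec s ops
exec s (JUMP a : ops)                = exec s (jump a ops)
exec _ _                             = error "exec: ill-formed code or stack"

unwind :: Stack -> Code -> Stack
unwind [] _            = []
unwind (VAL _ : s) ops = unwind s ops
unwind (HAN a : s) ops = exec s (jump a ops)

jump :: Addr -> Code -> Code
jump _ []              = []
jump a (LABEL b : ops) = if a == b then ops else jump a ops
jump a (_ : ops)       = jump a ops

isFresh :: Addr -> Stack -> Bool
isFresh _ []          = True
isFresh a (VAL _ : s) = isFresh a s
isFresh a (HAN b : s) = a > b && isFresh a s

conv :: Stack -> Maybe Int -> Code -> Stack
conv s Nothing  ops = unwind s ops
conv s (Just n) ops = exec (VAL n : s) ops

-- Theorem 4 and lemma 6, phrased as boolean properties.
theorem4 :: Addr -> Stack -> Expr -> Code -> Bool
theorem4 a s e ops =
  not (isFresh a s) || exec s (comp a e ++ ops) == conv s (eval e) ops

lemma6 :: Addr -> Expr -> Bool
lemma6 a e = snd (compile a e) >= a

-- All expressions up to a given nesting depth over the leaves Val 1 and Throw.
exprs :: Int -> [Expr]
exprs 0 = [Val 1, Throw]
exprs n = exprs 0 ++ [f x y | f <- [Add, Catch], x <- sub, y <- sub]
  where sub = exprs (n - 1)

-- Check both properties on a few hand-picked stacks and code suffixes.
checkAll :: Bool
checkAll = and
  [ theorem4 a s e ops && lemma6 a e
  | a   <- [1, 2]
  , s   <- [[], [VAL 10], [VAL 10, HAN 0, VAL 20]]
  , e   <- exprs 2
  , ops <- [[], [PUSH 7, ADD], [LABEL 0, PUSH 9]]
  ]
```

A test of this kind only samples the statement, of course; the inductive proofs above are what establish it for all expressions, stacks and code.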
8 Further Work

We have shown how the compilation of exceptions using stack unwinding can 
be explained and verified in a simple manner. In this final section we briefly 
describe a number of possible directions for further work. 

— Mechanical verification. The correctness of our two compilers for exceptions 
has also been verified mechanically. In particular, theorem 3 was verified in 
Lego by McBride [6], and theorem 4 in Isabelle by Nipkow [10]. A novel 
aspect of the Lego verification that merits further investigation is the use 
of dependent types to precisely capture the stack demands of the virtual 
machine operations (e.g. add requires a stack with two integers on the top), 
which leads to a further simplification of our correctness proof, at the expense 
of requiring a more powerful type system. 

— Modular compilers. Inspired by the success of using monads to define the 
denotational semantics of languages in terms of the semantics of individual
features [9], similar techniques are now being applied to build compilers in 
a modular manner [4]. To date, however, this work has not considered the 
compilation of exceptions, so there is scope for trying to incorporate the 
present work into this modular framework. 

— Calculating the compiler. Rather than first defining the compiler and virtual 
machine and then proving their correctness with respect to the semantics, 
another approach would be to try and calculate the definition of these func- 
tions starting from the compiler correctness theorem itself [7,1], with the 
aim of giving a systematic discovery of the idea of compiling exceptions 
using stack unwinding, as opposed to a post-hoc verification. 

— Generalising the language. Arithmetic expressions with exceptions served as 
a suitable language for the purposes of this article, but it is important to 
explore how our approach can be scaled to more expressive languages, such 
as a simple functional or imperative language, to languages with more than 
one kind of exception and user-defined exceptions, and to other notions of 
exception, such as imprecise [13] and asynchronous [5] exceptions. 

— Compiler optimisations. The basic compilation method presented in this ar- 
ticle can be optimised in a number of ways. For example, we might rearrange 
the compiled code to avoid the need to jump over handlers in the case of 
no exceptions being thrown, use a separate stack of handler addresses to 
make the process of stack unwinding more efficient, or use a separate table 
of handler scopes to avoid an execution-time cost for installing a handler. It 
would be interesting to consider how such optimisations can be incorporated 
into our compiler and its correctness proof. 
1 Introduction

For many years the realm of total functions was considered to be the natural 
domain in which to reason about functional programs [B90]. The limitation of 
this domain is that it can only be used to describe deterministic programs: those 
that deliver only one output for each input. More recently, the development of 
the relational calculus for program derivation [BdM97] has allowed program- 
mers to reason about programs together with their specifications, which may be 
nondeterministic: they may offer a choice of outputs for each input. Specifica- 
tions expressed in this calculus can be manipulated and refined into functions 
that can be translated into an appropriate functional programming language. 
We now propose to go one step further, since the domain of relations can be 
used to describe only angelic or demonic nondeterminism, but not both. 

Angelic nondeterminism occurs when the choice is made by an ‘angel’: it is 
assumed that the angel will choose the best possible outcome. Demonic nonde- 
terminism occurs when the choice is made by a ‘demon’: no assumption can be 
made about the choice made by the demon, so we must be prepared for the worst 
possible outcome. It is well known that both these kinds of behaviour can be 
described in the domain of monotonic predicate transformers [BvW98,Mor98], 
but this is usually associated with the derivation of imperative, rather than 
functional programs. 

An equivalent relational model that has recently been proposed consists of 
up-closed multirelations [R03]. 

Relational equivalents of predicate transformers have been introduced in the 
past (see [LyV92] for example), but Rewitzky has extended this equivalence
to show that any monotonic function over a boolean algebra has an alternative 
representation as a multirelation. 

We will show how specifications involving both angelic and demonic nondeter- 
minism can be expressed and manipulated as multirelations. Such specifications 
can be refined until they contain only one kind of nondeterministic choice. Then 
they can be translated into programs in which those remaining choices are made 
interactively by an external agent at run time. 

This is not the first attempt to introduce angelic and demonic nondeter- 
minism into a calculus for the derivation of functional programs. Ward [W94] 
developed a refinement calculus for this purpose, in a similar style to that of 
[BvW98,Mor98], and with a corresponding predicate transformer semantics. 
However, we argue that multirelations will provide a better model for several
reasons. First, one of the beauties of the relational calculus (for examples of its 
use see [BaH93], [BdM97], [ScS88]) is that specifications are simply relations: 
there is no separate command language. As such, they can be manipulated by 
all the familiar operations on relations, such as composition and inverse, as well 
as those developed specifically for the derivation of functional programs. In con- 
trast, in the refinement calculus of both [BvW98,Mor98] and [W94] the command 
language is distinct from its predicate transformer semantics. So specifications 
can only be manipulated using laws that were previously derived from the se- 
mantics. This places a burden on the developer to memorise all the laws, and 
can complicate the reasoning about operations as fundamental as composition, 
which can be expressed simply for predicate transformers, but much less so for 
specifications in the command language. One solution to this problem could be 
to represent specifications directly as predicate transformers, but this might be 
counter-intuitive because they model programs in reverse, mapping postcondi- 
tions to preconditions. In contrast, multirelations can be used to relate inputs, 
or initial states, to outputs, or final states, so we propose to blur the distinction 
between multirelations and the specifications they represent. 

As an illustration of our theory, we will give examples demonstrating how 
multirelations can be used to specify and manipulate some games and resource- 
sharing protocols. For example, in a two-player game the choices of our player 
could be modelled by angelic nondeterminism, and those of our opponent by 
demonic nondeterminism. The resulting specification would allow us to reason 
about the game as a whole, and then derive an implementation that achieves 
some goal, such as an optimal strategy for the player. The implementation would 
then consist of a computerised player, who would play according to the strategy 
against a human opponent who could make demonic choices in an attempt to 
defeat the machine. 

2 Nondeterministic Specifications 

2.1 Agents 

We are primarily interested in specifying systems that contain both angelic and 
demonic choice. One way to think of such specifications is as a contract [BvW98] 
between two agents, both of whom are free to make various choices. One of the 
agents represents our interests, but the other may not. In this context, the angelic 
choices are interpreted as those made by our agent, since we assume that he will 
always make the choice that results in the best outcome for us. Conversely, the 
demonic choices are those made by the other agent, and since we have no control 
over them we must be prepared for all possibilities, including the worst. 

For example, the specification of a simple vending machine from a customer’s 
viewpoint might contain an angelic choice of coin insertion and chocolate bar 
selection, followed by a demonic choice of coins returned by the machine as 
change. This can be viewed as a contract where our agent is the customer and 
the other agent is the machine designer who can choose how the machine gives
out change. 

Specifications like this, featuring both angelic and demonic nondetermin- 
ism, have been modelled previously using monotonic predicate transformers 
[BvW98,Mor98], but until now there has been no equivalent relational model. 
The problem with the ordinary relational calculus is that only one kind of non- 
determinism can be described at a time, but fortunately this is not the case for 
the multirelational calculus as we shall now see. 



2.2 Multirelations 

Multirelations were introduced in [R03] as an alternative to predicate transform- 
ers for reasoning about programs. As the name suggests, they are to relations 
what multifunctions (relations) are to functions: that is to say, they are relations 
whose target type is a powerset type. 

Definition 1 (multirelation) A multirelation with source A and target B is
a subset of the cartesian product A × PB. As such, it is a set of ordered pairs
(x, X) where x ∈ A and X ⊆ B.

We will interpret values of the source type as input values to a program, and 
subsets of the target type as predicates on that type (a given value satisfies a 
predicate if and only if it lies within the set). 

Multirelations model programs in a very different way from ordinary rela- 
tions: a relation R relates two values x and y, written x R y, if and only if 
the corresponding program can terminate with output y given input x. In con- 
trast, a multirelation M relates two values x and post , written x M post , if 
and only if the corresponding program can terminate with an output value that 
satisfies the predicate post given input x. Expressed in terms of a contract be- 
tween two agents, this means that our agent can make choices to ensure that 
the output value satisfies post, no matter what choices are made by the other 
agent. The other agent’s goals may not be directly opposed to ours, but since we 
have no influence over the choices he makes we must be prepared for the worst 
possible outcome. Therefore we can only say that a specification can achieve a 
given postcondition from some initial state if our agent can ensure this under all 
circumstances. 

For example, suppose that, given an input value of x, our agent has the choice
of two predicates {x - 2, x + 2} and {x - 1}. If he chooses the former, then the
choice of output value is x - 2 or x + 2 and is determined by the other agent. In
the latter case the other agent has no choice and x - 1 is the output value.

Not every multirelation is suitable for the specification of a contract. Suppose 
that post, post' ⊆ B are two postconditions such that post ⊆ post'. Clearly, if
our agent can make choices to ensure that an output value satisfies post, it must
also satisfy post'. Consequently, a multirelation can only represent a contract
between two agents if it is up-closed in the following sense: 
Definition 2 (up-closure) A multirelation M with source A and target B is
up-closed if for all x ∈ A and X, Y ⊆ B,

x M X ∧ X ⊆ Y ⇒ x M Y

We will denote the type of all up-closed multirelations with source A and target
B by A ⇉ B.

We will use the standard lattice theoretic notation [DaP02] to denote the
upward closure (↑) of a set of sets Z ∈ PPB, or a single set X ∈ PB:

↑Z = {Y | (∃X : X ∈ Z : X ⊆ Y)}
↑X = {Y | X ⊆ Y}



Before moving on to some examples, we describe how the nondeterminism in 
an up-closed multirelation corresponds to angelic and demonic nondeterminism. 
The key to understanding the choices offered comes from considering strongest 
postconditions [DiS90], since the set of all strongest postconditions of a specifi- 
cation represents the set of all choices offered to our agent. 

Definition 3 (strongest postcondition) Let M : A ⇉ B, x ∈ A and post ⊆
B. Then post is a strongest postcondition of M with respect to x if and only if

1. x M post

2. (∀ post' : post' ⊆ B : x M post' ⇒ post' ⊄ post)

So, for example, let Int denote the type of all integers, M : Int ⇉ Int, and
suppose that for all x : Int and X : P Int

x M X ⇔ ({x - 1} ⊆ X ∨ {x - 2, x + 2} ⊆ X)

then our agent can choose between the two strongest postconditions {x - 1} and
{x - 2, x + 2}. In general, we shall refer to the set of all strongest postconditions
of a multirelation M with respect to an initial state x by sp(x, M).
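
To make these definitions concrete, a finite up-closed multirelation can be modelled in Haskell by listing, for each input, a set of generating postconditions whose upward closure is the multirelation. The sketch below is one possible encoding and is not part of the theory itself: the names Multi, holds, upClosedOn, sp and example are hypothetical, sets are represented as duplicate-free lists, and the check of up-closure and the computation of strongest postconditions are restricted to a finite universe of outputs supplied by the caller.

```haskell
import Data.List (subsequences)

-- For each input, a list of generating postconditions (sets as lists).
type Multi a b = a -> [[b]]

-- x M post holds iff post contains some generator (upward closure).
holds :: Eq b => Multi a b -> a -> [b] -> Bool
holds m x post = any (all (`elem` post)) (m x)

-- The running example, generated by {x - 1} and {x - 2, x + 2}.
example :: Multi Int Int
example x = [[x - 1], [x - 2, x + 2]]

-- Definition 2, checked over all subsets of a finite universe.
upClosedOn :: Eq b => [b] -> Multi a b -> a -> Bool
upClosedOn univ m x =
  and [ holds m x ys
      | xs <- subsequences univ, holds m x xs
      , ys <- subsequences univ, all (`elem` ys) xs ]

-- Definition 3 within a finite universe: the establishable subsets
-- that have no establishable proper subset.
sp :: Eq b => [b] -> Multi a b -> a -> [[b]]
sp univ m x = [ p | p <- cands, not (any (`strictSub` p) cands) ]
  where
    cands         = [ p | p <- subsequences univ, holds m x p ]
    strictSub q p = all (`elem` p) q && length q < length p
```

For instance, with the universe [1 .. 7] and input 5, sp yields exactly the two postconditions {4} and {3, 7}, in agreement with the discussion above.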

To show how up-closed multirelations can be used to model specifications, 
we will now look at a few simple examples, some deterministic and some nonde- 
terministic. In the former case, our agent is not offered any choice, and neither 
is the other agent, and thus the deterministic specifications are guaranteed to 
satisfy any postcondition which holds for an output value corresponding to the 
given input value. 

Examples 4 

1. Each type A has identity specification ∈_A : A ⇉ A, where ∈_A (also denoted
by id_A) represents the set membership relation on subsets of A. So, given an
input value x ∈ A, the (sole) strongest postcondition it can achieve is {x},
which means that the value x is output. This specification is guaranteed to
establish all postconditions that are satisfied by x itself.
2. For each pair of types A, B, and each value b ∈ B, the constant specification
const b : A ⇉ B is defined by

const b = A × ↑{b}

Here, the strongest postcondition that is satisfied for any input value is {b},
ensuring that the value b is output.

3. Let Bool denote the type of booleans, and let guard : A → Bool. We define
the specification {| guard |} : A ⇉ A, which asserts that a given input value
satisfies guard, for all x : A, X : PA by

x {| guard |} X ⇔ guard x ∧ x ∈ X

So, if x satisfies the guard, this specification will output x, but it will not
output anything otherwise.

4. Consider the multirelation A × ↑{1,2}, for some source type A. Here the
strongest postcondition for any input value is {1,2}, so our agent has no
choice and must select the postcondition {1,2}, giving the other agent the
choice between output values 1 and 2. This is an example of demonic
nondeterminism.

5. In contrast, now consider the multirelation A × (↑{1} ∪ ↑{2}), for some
source type A. Here the strongest postconditions for any input value are {1}
and {2}, so our agent can always choose between output values 1 and 2,
with no choice available for the other agent. This is an example of angelic
nondeterminism.

The last two of these examples illustrate the following more general characteri- 
sation of multirelations that contain purely demonic or angelic nondeterminism 
respectively: 

Definition 5 Let M : A ⇉ B, then

1. M is demonic if, for all x : A, and Z : PPB,

x M (⋂Z) ⇔ (∀X : X ∈ Z : x M X)

2. M is angelic if, for all x : A, and Z : PPB,

x M (⋃Z) ⇔ (∃X : X ∈ Z : x M X)

Notice that a demonic multirelation has only one strongest postcondition for 
any initial state, which is to be expected since it contains no angelic choice. 
In contrast, an angelic one may have many strongest postconditions, but each 
non-empty one must be a singleton set since it contains no demonic choice. 

We will now introduce some operations on multirelations so that we can show 
how they can be used to model more interesting specifications. 
2.3 Operations on Multirelations

All of the definitions that follow are taken directly from [R03] . 



Definition 6 (angelic choice) The angelic choice of two programs M, N :
A ⇉ B is the union M ∪ N : A ⇉ B.

Given any input value, our agent can ensure that M ∪ N establishes any postcondition
that could be achieved by either M or N alone, given the same input
value. 

Example 7 For each pair of types A, B and values a, b : B, we have

const a ∪ const b = A × ↑{a} ∪ A × ↑{b}

Here, our agent can choose between the two strongest postconditions {a} and
{b}. So he can ensure that this specification establishes any postcondition that
is satisfied by either of the values a or b, depending on what outcome we aim to 
achieve. 

The dual of angelic choice is demonic choice: 

Definition 8 (demonic choice) The demonic choice of two programs M, N :
A ⇉ B is the intersection M ∩ N : A ⇉ B.

Given any input value, our agent can ensure that M ∩ N establishes any postcondition
that could be satisfied by both M and N, given the same input value.

Example 9 For each pair of types A, B and values a, b : B, we have

const a ∩ const b = A × ↑{a} ∩ A × ↑{b}
                  = A × ↑{a, b}

Our agent has no choice here: this specification will establish any postcondition
that is satisfied by both a and b. The strongest postcondition it satisfies is {a, b},
showing that it will output one of the values a or b, but we cannot say which.
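
The two choice operators can be sketched executably if an up-closed multirelation is represented by a finite list of generating postconditions for each input. This representation, and the names Multi, holds, constM, angelic and demonic, are assumptions made for illustration only. Union simply pools the generators, while intersection uses the fact that a set lies in both upward closures exactly when it contains a generator of each.

```haskell
-- For each input, a list of generating postconditions (sets as lists).
type Multi a b = a -> [[b]]

-- x M post holds iff post contains some generator (upward closure).
holds :: Eq b => Multi a b -> a -> [b] -> Bool
holds m x post = any (all (`elem` post)) (m x)

-- The constant specification const b, generated by {b} for every input.
constM :: b -> Multi a b
constM b _ = [[b]]

-- Angelic choice (union): the generators of both arguments.
angelic :: Multi a b -> Multi a b -> Multi a b
angelic m n x = m x ++ n x

-- Demonic choice (intersection): one generator from each, combined.
demonic :: Multi a b -> Multi a b -> Multi a b
demonic m n x = [ p ++ q | p <- m x, q <- n x ]
```

With this encoding, the angelic choice of constM 'a' and constM 'b' can establish {a} or {b}, whereas their demonic choice can establish {a, b} but neither singleton, as in Examples 7 and 9.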

Multirelations cannot be composed using ordinary relational composition for 
obvious type reasons. Instead, composition is defined as follows: 

Definition 10 (composition) The composition of two multirelations M : A ⇉
B, N : B ⇉ C is denoted by M § N : A ⇉ C where for all x : A, X : P C

x (M § N) X ⇔ (∃Y :: x M Y ∧ (∀y : y ∈ Y : y N X))

So, given input value x, our agent can only guarantee that M § N will output a
value that satisfies X if he can ensure that M will establish some intermediate
postcondition Y and if he can also guarantee that N will establish X given any
value in Y.

It is routine to check that the composition operator is associative, with identity
id_A for each A, and preserves up-closure. An equivalent formulation of its
definition that can be useful in calculations is given below.
Lemma 11 Let M : A ⇉ B and N : B ⇉ C, then for all x : A, X : P C

x (M § N) X ⇔ (∃Z : Z ∈ sp(x, M) : (∀z : z ∈ Z : z N X))

The proof of this lemma is in the Appendix. We are now in a position to give 
some more interesting examples of specifications. 

Examples 12 

In these examples, we will show how multirelations can be used to model part 
of a simplified version of the game of Nim [BvW98]. For details of the proper 
game, see [BCG01] for example. The simplified game involves two players and a 
pile of matches. The players take it in turns to remove either one or two matches 
from the pile. The loser is the player who removes the last match. So suppose 
that for all n : Int, sub n : Int ⇉ Int represents the program that subtracts the
value n from its argument: for all x : Int, X : P Int

x (sub n) X ⇔ (x - n) ∈ X

We will model the choices of our player by angelic nondeterminism and our 
opponent by demonic nondeterminism, so we have 

player : Int ⇉ Int
player = sub 1 ∪ sub 2
opponent : Int ⇉ Int
opponent = sub 1 ∩ sub 2

More explicitly, for all x : Int, X : P Int

x player X ⇔ {x - 1} ⊆ X ∨ {x - 2} ⊆ X
x opponent X ⇔ {x - 1, x - 2} ⊆ X

In each case the input value x is an integer which represents the number of 
matches in the pile at the start of that player’s move. For simplicity, we do not 
check that this integer is positive, unlike [BvW98]. 

Although Nim is a symmetric game, the following examples illustrate that 
the guarantees for our player are very different depending on whether he moves 
first or second. 

1. First suppose that our player has the first move, then a round of the game 
is defined by 

round : Int ⇉ Int
round = player § opponent

and we can calculate that for all x : Int, X : P Int

   x round X
⇔    {Definition of round}
   x (player § opponent) X
⇔    {Lemma 11}
   (∃Z : Z ∈ sp(x, player) : (∀z : z ∈ Z : z opponent X))
⇔    {Definition of player}
   (∃Z : Z ∈ {{x - 1}, {x - 2}} : (∀z : z ∈ Z : z opponent X))
⇔    {∃, range}
   ((x - 1) opponent X) ∨ ((x - 2) opponent X)
⇔    {Definition of opponent}
   {x - 2, x - 3} ⊆ X ∨ {x - 3, x - 4} ⊆ X

We will use this program to reason about the game of Nim as a whole in 
Section 4. 

2. Now suppose that the opponent has the first move, and so a round of the 
game is now defined by 

newround : Int ⇉ Int
newround = opponent § player

By a similar calculation to that for round we can deduce that for all x :
Int, X : P Int

x newround X ⇔ {x - 2, x - 4} ⊆ X ∨ {x - 3} ⊆ X

This is exactly what we would expect from our intuition about angelic and
demonic nondeterminism, since the value input to player is either x - 1 or
x - 2, and so our player can always choose to output x - 3, or alternatively
one of x - 2 or x - 4.
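
The two rounds of Nim can be replayed mechanically. The following sketch represents each up-closed multirelation by a finite list of generating postconditions per input, which is an assumption of this illustration rather than part of the theory, and implements composition in the style of Lemma 11: choose a generator of the first specification, then a generator of the second for each value in it, and collect the outputs. All Haskell names are hypothetical, and round is called round1 to avoid a clash with the standard function of that name.

```haskell
-- For each input, a list of generating postconditions (sets as lists).
type Multi a b = a -> [[b]]

-- x M post holds iff post contains some generator (upward closure).
holds :: Eq b => Multi a b -> a -> [b] -> Bool
holds m x post = any (all (`elem` post)) (m x)

-- Angelic choice (union) and demonic choice (intersection).
angelic, demonic :: Multi a b -> Multi a b -> Multi a b
angelic m n x = m x ++ n x
demonic m n x = [ p ++ q | p <- m x, q <- n x ]

-- Composition: for a generator zs of m, pick one generator of n at each
-- z in zs (mapM in the list monad enumerates all such picks) and merge.
compose :: Multi a b -> Multi b c -> Multi a c
compose m n x = [ concat choice | zs <- m x, choice <- mapM n zs ]

-- Subtract k matches from the pile.
sub :: Int -> Multi Int Int
sub k x = [[x - k]]

player, opponent, round1, newround :: Multi Int Int
player   = sub 1 `angelic` sub 2
opponent = sub 1 `demonic` sub 2
round1   = player `compose` opponent
newround = opponent `compose` player
```

For a pile of 10 matches, holds round1 10 [8, 7] and holds newround 10 [7] both succeed, whereas holds round1 10 [7] fails, matching the two calculated characterisations.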

The notation M^n will be used to denote the composition of a homogeneous
multirelation M : A ⇉ A with itself n times. This operator satisfies the following
fusion law:

Lemma 13 Suppose M, N, K : A ⇉ A, and let n : ℕ⁺, then

M § K = K § N ∧ N = K § N ⇒ M^n § K = N^n

The proof of this lemma is in the Appendix. 

Two more operators that are frequently useful for defining repetition in pro- 
gram specifications are transitive closure and reflexive transitive closure: 

Definition 14 Let M : A ⇉ A.

1. The transitive closure of M is denoted by M⁺ : A ⇉ A, where

M⁺ = ⋃ {M^n | n > 0}

2. The reflexive transitive closure of M is denoted by M* : A ⇉ A, where

M* = ⋃ {M^n | n ≥ 0}

These operators satisfy many laws, including the following:

Lemma 15 Suppose M : A ⇉ A, then

M ⊆ M⁺

M⁺ ⊆ M § M*

Another pair of operators that are useful in program specifications are the fol- 
lowing well-known functions for lifting relations to multirelations [BvW98]: 

Definitions 16 (lifting) Let R : A → B, then

1. the angelic lifting ⟨R⟩ : A ⇉ B is defined for all x : A, X : P B by

x ⟨R⟩ X ⇔ (∃y : x R y : y ∈ X)

2. the demonic lifting [R] : A ⇉ B is defined for all x : A, X : P B by

x [R] X ⇔ (∀y : x R y : y ∈ X)

Note that if R is a total function, then the angelic and demonic liftings coincide. 
These operators are so-called because they map relations to angelic and demonic 
multirelations respectively. Both operators distribute through relational compo- 
sition, which will be denoted by ; . The difference between these operators can 
be illustrated by the game of Nim of Examples 12: 

Example 17 Let x, y : Int and define

move : Int → Int

x move y ⇔ y = x - 1 ∨ y = x - 2

then an alternative definition of the moves is given by

player = ⟨move⟩ and opponent = [move]

We have now seen that the set of up-closed multirelations is closed under 
angelic choice, demonic choice and composition, the latter of which can be used 
to define the familiar transitive closure operations. They are not closed under 
negation or set difference, and the converse operator cannot be defined, but 
union and intersection can be lifted to multirelations from the target set in 
an obvious way. Although negation cannot be lifted directly from relations to 
multirelations, it does have an analogue in the dual operator, which can be 
thought of as performing an allegiance swap, so that our (angelic) agent becomes 
demonic, and the other (demonic) agent becomes the angel: 

Definition 18 (dual) The dual of a specification M : A ⇉ B is denoted by
M° : A ⇉ B where for all x ∈ A, X ∈ P B,

x M° X ⇔ ¬(x M X̄)

where X̄ denotes the complement of X in B.

The dual of a multirelation corresponds to the conjugate operator on predicate 
transformers [DiS90]. The significance of this operator will become more apparent
after the discussion of correctness in the following section.
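
The liftings and the dual can also be explored in Haskell. In the sketch below, which is an illustration under stated assumptions rather than part of the theory, a relation is a function returning the list of related values and an up-closed multirelation is given by a finite list of generating postconditions per input. Under that representation the complement of X contains a generator of M exactly when some generator misses X, so the dual can be generated by picking one element from every generator; dualAgrees checks this against Definition 18 over a finite universe that is assumed to contain every generator element. All names are hypothetical.

```haskell
import Data.List (subsequences)

type Multi a b = a -> [[b]]   -- generating postconditions per input
type Rel a b   = a -> [b]     -- related values per input

-- x M post holds iff post contains some generator (upward closure).
holds :: Eq b => Multi a b -> a -> [b] -> Bool
holds m x post = any (all (`elem` post)) (m x)

-- Angelic lifting: each related value is a separate choice for our agent.
angLift :: Rel a b -> Multi a b
angLift r x = [ [y] | y <- r x ]

-- Demonic lifting: a single postcondition containing all related values.
demLift :: Rel a b -> Multi a b
demLift r x = [ r x ]

-- Dual: one element picked from every generator (list-monad sequence).
dual :: Multi a b -> Multi a b
dual m x = sequence (m x)

-- Definition 18 checked directly over every subset of a finite universe.
dualAgrees :: Eq b => [b] -> Multi a b -> a -> Bool
dualAgrees univ m x =
  and [ holds (dual m) x p == not (holds m x (filter (`notElem` p) univ))
      | p <- subsequences univ ]

move :: Rel Int Int
move x = [x - 1, x - 2]

player, opponent :: Multi Int Int
player   = angLift move
opponent = demLift move
```

In this encoding the dual of player has the same generators as opponent and vice versa, which illustrates the allegiance swap described above.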
3 Correctness, Refinement and Feasibility

There are many different notions of refinement for ordinary relations. For exam- 
ple, two opposing definitions are given in [BvW98], depending on whether the 
relations are being used to model angelic or demonic nondeterminism. In the 
former case, it is defined by subset inclusion, and in the latter by its inverse. 
Since the predicate transformer model of specifications is capable of expressing 
both kinds of nondeterminism within a single framework, it has not previously 
been thought necessary to endow it with more than one refinement ordering. 
Likewise, it has only one notion of correctness and feasibility. However, we ar- 
gue that any specification containing both angelic and demonic nondeterminism 
must offer certain guarantees of behaviour to both the angel and the demon. 
These guarantees can be formalised as two contrasting correctness conditions, 
each of which has an associated refinement ordering and feasibility condition. 
The choice of which one to use depends on whether we see the specification from 
the viewpoint of the angel or the demon. Either way, this choice must be made at 
the start of program development and stay fixed throughout the design process. 

3.1 Correctness 

To demonstrate the need for a second correctness criterion, we consider again 
the following simple example of a multirelation. For x : Int and X : P Int, define
M : Int ⇉ Int by

x M X ⇔ ({x - 1} ⊆ X ∨ {x - 2, x + 2} ⊆ X)

Clearly, as the angel can choose between the two strongest postconditions {x - 1}
and {x - 2, x + 2}, the angel has some guarantees on the output value. In addition,
the demon also has some guarantees. For example, he is never obliged to output
the value x - 2 if he chooses not to.

Both these kinds of guarantee are instances of the two more general cor- 
rectness conditions defined below. One requirement of such conditions is that 
they should be monotonic in the following sense: suppose that a specification 
M : A ⇉ B is correct with respect to an initial state x ∈ A and postcondition
post ⊆ B, and suppose that post' ⊆ B is another postcondition with the
property that post ⊆ post'. Then M must also be correct with respect to x and
post'. 

Angelic Correctness. Correctness for the angel is easily defined in the mul- 
tirelational model. By definition, x M post if and only if the angel can make 
choices to ensure that the output value satisfies post. So we can simply say that 
a specification M : A ⇉ B is angelically correct with respect to an initial state
x ∈ A and postcondition post ⊆ B if

x M post    (1)

The upward closure of $M$ ensures that this condition is monotonic. It is equivalent 
to the traditional notion of correctness for predicate transformers in the 
refinement calculus [BvW98,Mor98]. 

Demonic Correctness. Correctness for the demon is more difficult to express 
because of the asymmetry of the multirelational model. Conditions that a demon 
can guarantee to satisfy are given by the following definition of correctness: a 
specification $M : A \rightrightarrows B$ is demonically correct with respect to an initial state 
$x \in A$ and postcondition $\mathit{post} \subseteq B$ if 

$$x \; M^{\circ} \; \mathit{post} \qquad (2)$$

This means that the angel can never establish the complement of post , thereby 
ensuring that the demon can establish post. It is easy to check that this definition 
of correctness satisfies the monotonicity condition imposed above. 
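
The two conditions can be tried out on the example multirelation $M$ of this section. The following is only a minimal sketch, under the assumption that $M$ is represented by its list of strongest postconditions at each input and that postconditions are given as Python predicates; the names `sp`, `angelically_correct` and `demonically_correct` are illustrative and not part of the calculus.

```python
from typing import Callable, FrozenSet, List

Post = Callable[[int], bool]  # a postcondition, as a predicate on outputs


def sp(x: int) -> List[FrozenSet[int]]:
    """Strongest postconditions of the example multirelation M at input x."""
    return [frozenset({x - 1}), frozenset({x - 2, x + 2})]


def angelically_correct(x: int, post: Post) -> bool:
    # Condition (1): the angel can pick a strongest postcondition that
    # lies entirely inside post.
    return any(all(post(z) for z in Z) for Z in sp(x))


def demonically_correct(x: int, post: Post) -> bool:
    # Condition (2): the angel cannot establish the complement of post,
    # so every strongest postcondition contains a value satisfying post.
    return all(any(post(z) for z in Z) for Z in sp(x))


x = 10
print(angelically_correct(x, lambda z: z == x - 1))   # the angel can force x - 1
print(demonically_correct(x, lambda z: z != x - 2))   # the demon can avoid x - 2
```

Both calls print `True`, matching the two guarantees described informally at the start of this section.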

The following example gives an illustration of both the foregoing correctness 
criteria. 

Example 19 (Cake Cutting Protocol) 

We will look at a well-known two-participant protocol which can be described 
as “I cut; you choose”. 

A cake can be modelled as the unit interval $I = [0, 1]$, and slices of cake as 
sub-intervals of the form $[x, y]$, where $0 \le x \le y \le 1$. A single cut dividing the 
cake into two pieces can be modelled by the relation $\mathit{cut} : 1 \to I$, defined to be 

$$\mathit{cut} = 1 \times I$$

Thus, given the (null) input, $\mathit{cut}$ may output any cutting position from 0 to 1. 
The selection of one of the two resulting pieces of cake can be modelled by the 
relation $\mathit{choose} : I \to I \times I$, where if $p : I$ and $i : I \times I$ 

$$p \; \mathit{choose} \; i \;\Leftrightarrow\; i = [0, p] \;\lor\; i = [p, 1]$$

where square brackets have been used instead of the more familiar braces to 
indicate that this pair of values represents an interval. The following cake cutting 
protocol describes our agent cutting the cake first, followed by the other agent 
selecting one of the resulting pieces: 

$$\langle \mathit{cut} \rangle \fatsemi [\mathit{choose}] \qquad (3)$$

The two agents may have different preferences for cake slices (for example, 
one may prefer chocolate sprinkles and another may like hazelnuts but not sprinkles). 
So two different functions are used for measuring the value of cake slices. 
The value function for our agent will be denoted by $\mathit{value}_a$, and that for the 
opposing agent by $\mathit{value}_d$. Both functions have type $I \times I \to I$, and they must 
be continuous and additive. This means that if the cake is divided into slices, 
the sum of the values of the slices is the value of the whole cake (defined to be 
1). In particular, we have for each $b$: 

$$(\forall p : p \in I : \mathit{value}_b\,[0, p] + \mathit{value}_b\,[p, 1] = 1) \qquad (4)$$

The set of strongest postconditions offered to our agent by the protocol can be 
calculated as 

$$\{\{[0, p], [p, 1]\} \mid p \in I\} \qquad (5)$$

(these describe predicates on the other agent’s cake slice). 

A common aim of a cake cutting protocol is to achieve fair division: for $n$ 
participants, this means that each participant receives a slice that they consider 
to be worth at least $1/n$ of the whole. So for two participants this means that 
each agent $b$ should be able to achieve the postcondition $\mathit{fair}_b$ (on the slice chosen 
by the demon agent), where 

$$\mathit{fair}_d = \{i \mid \exists p : p \in I : (i = [0, p] \lor i = [p, 1]) \land \mathit{value}_d\,i \ge 0.5\}$$
$$\mathit{fair}_a = \{i \mid \exists p : p \in I : (i = [0, p] \lor i = [p, 1]) \land \mathit{value}_a\,i \le 0.5\}$$

Thus $\mathit{fair}_d$ expresses that the demon agent wants his own slice of cake to be at 
least half the whole, and $\mathit{fair}_a$ expresses that the angelic agent wants the demon’s 
slice of cake to be no more than half the whole. More precisely, to achieve fair 
division of the cake, the multirelation $\langle \mathit{cut} \rangle \fatsemi [\mathit{choose}]$ must be angelically correct 
with respect to $\mathit{fair}_a$, and demonically correct with respect to $\mathit{fair}_d$. We will now 
show that this is indeed the case. 

First, consider our agent. As $\mathit{value}_a$ is continuous, our agent can choose a 
value $x$ such that 

$$\mathit{value}_a\,[0, x] = \mathit{value}_a\,[x, 1] = 0.5$$

and thus, by (5), our agent can achieve the postcondition $\{[0, x], [x, 1]\}$. By upward 
closure and (1) the protocol is therefore angelically correct with respect to $\mathit{fair}_a$. 

Now consider the other agent. By (2), the protocol is demonically correct 
with respect to $\mathit{fair}_d$ if and only if our agent cannot establish the postcondition 
$\overline{\mathit{fair}_d}$, the complement of $\mathit{fair}_d$. By (4), one of the slices of cake must be worth 
at least 0.5 to the other agent, that is to say, 

$$(\forall p : p \in I : \mathit{value}_d\,[0, p] \ge 0.5 \;\lor\; \mathit{value}_d\,[p, 1] \ge 0.5)$$

Hence by (5), every postcondition that our agent can establish contains a value 
in $\mathit{fair}_d$, and thus we cannot establish $\overline{\mathit{fair}_d}$. So the other agent is also able to 
ensure receiving what he considers to be a fair slice of cake. 
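
The argument of Example 19 can be replayed numerically. The sketch below is illustrative only: the two valuations `value_a` and `value_d` are hypothetical choices (any continuous, additive valuations with total value 1 would do), the cut point is found by bisection, and the function names are not taken from the paper.

```python
def value_a(lo: float, hi: float) -> float:
    """Hypothetical valuation for our agent: density 3t^2 on [0, 1]."""
    return hi ** 3 - lo ** 3


def value_d(lo: float, hi: float) -> float:
    """Hypothetical valuation for the other agent: density 2t on [0, 1]."""
    return hi ** 2 - lo ** 2


def cut(tolerance: float = 1e-12) -> float:
    """Angelic step: bisect for x with value_a[0, x] = value_a[x, 1] = 0.5."""
    lo, hi = 0.0, 1.0
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if value_a(0.0, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def choose(p: float) -> tuple:
    """Demonic step: the other agent takes the slice he values more."""
    left, right = (0.0, p), (p, 1.0)
    return left if value_d(*left) >= value_d(*right) else right


p = cut()
slice_d = choose(p)
print(value_d(*slice_d) >= 0.5)           # fair_d holds for the chosen slice
print(value_a(*slice_d) <= 0.5 + 1e-9)    # fair_a holds, up to rounding
```

The small tolerance in the last check only absorbs floating-point error in the bisection; in the exact model both slices are worth precisely 0.5 to our agent.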

3.2 Refinement 

One requirement of a refinement relation is that it should preserve correctness, 
but there are now two possible interpretations of this requirement, correspond- 
ing to the two notions of correctness defined above. Both are valid in different 
circumstances. 

As motivation of this idea, consider an arbitrary specification involving two 
agents. Usually, one of the agents represents an external decision maker of some 
kind, such as a human user of the system; we will refer to this as the external 
agent. The choices offered to this agent could include menu options, passwords, or 
moves in a game. The second agent in a typical specification usually represents 
the system designer; we will refer to this as the internal agent. The choices 
offered to this agent represent deferred design decisions. As such, the designer is 
free to implement them however he chooses. If the system is to be implemented 
in a deterministic language, then ultimately all such choices must be removed 
completely, or at least hidden so that they depend on information that is not 
visible to the external agent. A refinement of a specification represents a move 
towards such an implementation, so it must consist of a decrease in internal 
choice, but not of external choice, which may even be increased. Either way, 
a refinement must preserve any guarantee of system behaviour that has been 
offered to the external agent by the original specification. So if the external 
agent is angelic, refinement must preserve angelic correctness, and dually if it is 
demonic. We will refer to these two separate concepts as angelic and demonic 
refinement respectively. 

Angelic Refinement. The situation where the external agent is angelic is 
shared by many specifications. For example, consider a customer who is con- 
tracting a programmer to deliver a piece of software to accomplish some task. If 
such a specification includes choices, then it is assumed that the software user 
will make the best (angelic) choice to achieve the desired goal, regardless of any 
(demonic) design choices made by the programmer. This is the interpretation 
of nondeterminism used in the refinement calculus: our agent is external, and 
the other one is internal. So a refinement consists of a possible decrease in demonic 
choice and a possible increase in angelic choice, and can be defined for all 
$M, M' : A \rightrightarrows B$ by 

$$M \sqsubseteq_a M' \;=\; M \subseteq M' \qquad \text{(angelic refinement)}$$

Clearly, this preserves the corresponding notion of angelic correctness (1). In- 
tuitively, this definition means that a refinement can only increase the set of 
postconditions that our agent can choose to satisfy and hence makes it easier to 
satisfy the specification. 

Demonic Refinement. A less common situation is that where we are con- 
cerned with developing a program that is as uncooperative as possible with the 
user, who is potentially hostile and might be capable of preventing our program 
from achieving its goal. For example, consider the game of Nim (Examples 12), 
where the external agent is the human opponent and the internal agent is the 
programmer of the computerised player. Here, the goal of the programmer is 
to implement a winning strategy to defeat the human adversary who is free to 
experiment with different strategies each time he plays. Any such implementa- 
tion of the player must still offer some guarantee of behaviour to the user. For 
instance, the player must not remove more than two matches. 

Another kind of specification that must guard against harmful users might 
be a security application such as a password protection mechanism. Here, the 
programmer will make the best choices to achieve a secure system that minimises 
the probability of a successful breach by an attacker. 

In both these examples we assume that if the programmer wishes to achieve 
some outcome then he will implement the angelic choice that does so regardless of 
the choice made by the user. So a refinement must consist of a possible decrease 
in angelic choice and a possible increase in demonic choice: for all $M, M' : A \rightrightarrows B$ 

$$M \sqsubseteq_d M' \;=\; M \supseteq M' \qquad \text{(demonic refinement)}$$

Once again, this preserves the corresponding notion of correctness (2). Here, a 
refinement narrows down the selection of achievable postconditions to exclude 
those that our agent definitely does not want to satisfy. So, for example, one 
refinement of the player’s move in the specification of Nim would be to always 
remove one match if there are only two left in the pile: 

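$$\mathit{rplayer} = (\llparenthesis\, (= 2) \,\rrparenthesis \fatsemi \mathit{sub}\ 1) \;\cup\; (\llparenthesis\, (\neq 2) \,\rrparenthesis \fatsemi \mathit{player})$$

where the notation $\llparenthesis\, \cdot \,\rrparenthesis$ was introduced in Examples 4. A better refinement of 
the player’s move will be considered in Section 4. 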

It is interesting to note that the definitions of angelic and demonic refinement 
given above are identical to those given for ordinary relations in [BvW98], even 
though they are used here for an entirely different purpose. 
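
For multirelations over a finite state space, both refinement orderings can be checked mechanically. The following sketch assumes that a multirelation is stored as a dictionary from each input to a list of generating postconditions, with upward closure left implicit; the three example specifications are hypothetical and only illustrate the direction of each ordering.

```python
from typing import Dict, FrozenSet, List

# input state -> generating postconditions (upward closure is implicit)
MultiRel = Dict[int, List[FrozenSet[int]]]


def holds(m: MultiRel, x: int, post: FrozenSet[int]) -> bool:
    """x M post: some generating postcondition at x is contained in post."""
    return any(Z <= post for Z in m[x])


def included(m: MultiRel, m2: MultiRel) -> bool:
    """M is a subset of M', checked on the generators of M."""
    return all(holds(m2, x, Z) for x in m for Z in m[x])


def angelic_refines(m: MultiRel, m2: MultiRel) -> bool:
    return included(m, m2)   # angelic refinement: M is contained in M'


def demonic_refines(m: MultiRel, m2: MultiRel) -> bool:
    return included(m2, m)   # demonic refinement: M contains M'


spec = {0: [frozenset({1}), frozenset({2, 3})]}
less_demonic = {0: [frozenset({1}), frozenset({2})]}   # demon's choice in {2, 3} resolved
less_angelic = {0: [frozenset({1})]}                   # angel's second option removed

print(angelic_refines(spec, less_demonic))   # True
print(demonic_refines(spec, less_angelic))   # True
print(angelic_refines(spec, less_angelic))   # False
```

The last line shows why the ordering has to be fixed in advance: removing an angelic option is a demonic refinement but not an angelic one.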

3.3 Feasibility 

A specification is feasible if it can be translated into an equivalent implemen- 
tation in the target programming language. So if the language is deterministic, 
a feasible program must be free of all internal choice. It may still contain some 
external choice however, since this can be resolved through user interaction at 
run-time. Therefore, if the external agent is our agent, a specification is feasible 
if and only if it contains only angelic choice, which is to say that it is angelic 
in the sense of Definition 5. Dually, if the external agent is not our agent, a 
specification is feasible if and only if it contains only demonic choice, which is to 
say that it is demonic in the sense of Definition 5. Hence either of the refinement 
rules of Section 3.2 can be used to transform any specification into one that is 
feasible. 

4 Applications 

4.1 Nim 

The version of Nim considered here was introduced in Examples 12. It involves 
two players and a pile of matches. The players take it in turns to remove 
either one or two matches from the pile and the loser is the player who removes 
the last match. The model that follows differs from that in [BvW98] in that the 
number of matches in the pile is allowed to become negative. So a game can be 
modelled as an indefinite number of rounds: 

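$$\mathit{nim} : \mathit{Int} \rightrightarrows \mathit{Int}$$
$$\mathit{nim} = \mathit{round}^*$$

where $\mathit{round}$ was defined previously in Examples 12 (1). At each round, our 
player can guarantee to establish one of two postconditions: 

Lemma 20 For all $x : \mathit{Int}$, $n : \mathbb{N}^+$ and $X : \mathbb{P}\,\mathit{Int}$, 

$$x \; \mathit{round}^n \; X \;\Leftrightarrow\; \{y, y + 1\} \subseteq X \;\lor\; \{y - 1, y\} \subseteq X$$

where $y = x - 3n$. 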



The proof of this lemma is in the Appendix. 
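
Lemma 20 can also be checked experimentally for small values of $n$. The sketch below is not the proof from the Appendix: it assumes that a multirelation is represented by its set of strongest postconditions at each input, that composition is computed as in Lemma 11, and that `player` and `opponent` are respectively the angelic and demonic choice between removing one or two matches (their definitions are in Examples 12, so they are an assumption here).

```python
from itertools import product
from typing import Callable, FrozenSet, Set

Sp = Callable[[int], Set[FrozenSet[int]]]  # input -> strongest postconditions


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}


def compose(m: Sp, n: Sp) -> Sp:
    """Lemma 11: pick Z in sp(x, M), then one postcondition of N for each z in Z."""
    def composed(x: int):
        candidates = set()
        for Z in m(x):
            for choice in product(*(n(z) for z in Z)):
                candidates.add(frozenset().union(*choice))
        return minimal(candidates)
    return composed


def player(x: int):    # assumed: angelic choice of removing one or two matches
    return {frozenset({x - 1}), frozenset({x - 2})}


def opponent(x: int):  # assumed: demonic choice of removing one or two matches
    return {frozenset({x - 1, x - 2})}


round_ = compose(player, opponent)


def power(m: Sp, n: int) -> Sp:
    """n-fold composition of m with itself, for n >= 1."""
    result = m
    for _ in range(n - 1):
        result = compose(result, m)
    return result


def lemma20(x: int, n: int):
    """The two strongest postconditions predicted by Lemma 20."""
    y = x - 3 * n
    return {frozenset({y, y + 1}), frozenset({y - 1, y})}


print(all(power(round_, n)(x) == lemma20(x, n)
          for n in range(1, 5) for x in range(-3, 12)))
```

Computing strongest postconditions as the minimal unions is only adequate here because every set involved is finite.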

Our player can choose to win the game if and only if he can establish the postcondition 
$\{0, -1\}$, since then the opponent is forced to remove the last match. 
By Lemma 20, this postcondition can be guaranteed if and only if there exists a 
natural number $n$ for which $y$ is equal to 0 or $-1$, which is equivalent to saying 
that $x$ must be positive and satisfy $x \bmod 3 \neq 1$. In either case our player can 
win after $p$ rounds, where 

$$p = (x - 1) \;\mathrm{div}\; 3 + 1 \qquad (6)$$
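
As a sanity check on (6), the closed form can be compared with a brute-force search for a suitable $n$. This is a small sketch with illustrative function names; the search bound `limit` is an arbitrary assumption that is large enough for the inputs tested.

```python
def can_win(x: int) -> bool:
    """Winning condition: x is positive and x mod 3 differs from 1."""
    return x > 0 and x % 3 != 1


def rounds_to_win(x: int) -> int:
    """Equation (6): p = (x - 1) div 3 + 1, meaningful when can_win(x)."""
    return (x - 1) // 3 + 1


def exists_winning_n(x: int, limit: int = 100) -> bool:
    """Brute force: is there n >= 1 with y = x - 3n equal to 0 or -1?"""
    return any(x - 3 * n in (0, -1) for n in range(1, limit))


for x in range(-10, 60):
    assert can_win(x) == exists_winning_n(x)
    if can_win(x):
        assert x - 3 * rounds_to_win(x) in (0, -1)
```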

Having established the existence of conditions under which our player can 
achieve a win, we must now refine the specification towards a feasible implementation 
of a winning strategy, assuming that the role of our player will be taken 
by a computer. So, suppose $\mathit{win} : \mathit{Int} \rightrightarrows \mathit{Int}$ is a refinement of $\mathit{nim}$: 

$$\mathit{nim} \sqsubseteq_d \mathit{win}$$

where the demonic refinement ordering is used here because our agent is the 
internal agent. Starting with $x$ matches, our player is guaranteed to win in the 
refined game $\mathit{win}$ iff for all $X : \mathbb{P}\,\mathit{Int}$ 

$$x \; \mathit{win} \; X \;\Leftrightarrow\; \{0, -1\} \subseteq X$$

since this condition asserts that $\mathit{win}$ has a single strongest postcondition, namely 
$\{0, -1\}$. We will begin to specify a refinement as follows: 

$$\mathit{win} = \mathit{round}^p \fatsemi \mathit{choosewin}$$

where $p$ was defined in (6) to be the round on which the last match will be 
removed, and $\mathit{choosewin}$ is a specification that selects the winning output from 
that round, so $\mathit{choosewin} \subseteq \mathit{id}$. To see that $\mathit{nim} \sqsubseteq_d \mathit{win}$ we calculate 

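$$\begin{array}{cl}
 & \mathit{nim} \\
= & \{\text{definition of } \mathit{nim}\} \\
 & \mathit{round}^* \\
\supseteq & \{\text{definition of } {}^*\} \\
 & \mathit{round}^p \\
\supseteq & \{\mathit{choosewin} \subseteq \mathit{id} \text{ and monotonicity}\} \\
 & \mathit{round}^p \fatsemi \mathit{choosewin}
\end{array}$$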

The most obvious way to define $\mathit{choosewin}$ is to use the notation for converting 
guards to multirelations of Examples 4 to define: 

$$\mathit{winner} : \mathit{Int} \to \mathit{Bool}$$
$$\mathit{winner}\ x = x \in \{0, -1\}$$
$$\mathit{choosewin}_0 = \llparenthesis\, \mathit{winner} \,\rrparenthesis$$

It is not difficult to check that this does indeed implement a winning strategy 
in the sense of (7), but as a practical method it is useless, as it gives no insight 
into the choice of move that our player must make at each round. Instead we 
need to generalise the definition of $\mathit{choosewin}$: 

$$\mathit{nearwin} : \mathit{Int} \to \mathit{Bool}$$
$$\mathit{nearwin}\ x = x \bmod 3 \neq 1$$
$$\mathit{choosewin} = \llparenthesis\, \mathit{nearwin} \,\rrparenthesis$$

This implementation still has the effect of filtering out unwanted postconditions, 
but it also has the advantage that it can be fused into the definition of win to 
provide a practical winning strategy for our player. Using Lemma 13, we have 

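$$\mathit{win} = \mathit{round}^p \fatsemi \mathit{choosewin} = (\mathit{round} \fatsemi \mathit{choosewin})^p$$

since $\mathit{round} \fatsemi \mathit{choosewin} = \mathit{choosewin} \fatsemi \mathit{round} \fatsemi \mathit{choosewin}$. If we also observe that 
$\mathit{nearwin}$ satisfies the following property 

$$\lnot(\mathit{nearwin}\ x) \;\Leftrightarrow\; (\mathit{nearwin}(x - 1) \land \mathit{nearwin}(x - 2)) \qquad (7)$$

for all $x \in \mathbb{N}$, we may calculate further that 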

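$$\begin{array}{cl}
 & \mathit{round} \fatsemi \mathit{choosewin} \\
= & \{\text{definition of } \mathit{round}\} \\
 & \mathit{player} \fatsemi \mathit{opponent} \fatsemi \mathit{choosewin} \\
= & \{\text{definition of } \mathit{choosewin}\} \\
 & \mathit{player} \fatsemi \mathit{opponent} \fatsemi \llparenthesis\, \mathit{nearwin} \,\rrparenthesis \\
= & \{\text{composition and (7)}\} \\
 & \mathit{player} \fatsemi \llparenthesis\, \lnot \mathit{nearwin} \,\rrparenthesis \fatsemi \mathit{opponent}
\end{array}$$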

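So we can replace the player’s move by the new, winning move $\mathit{playerwin}_0 = 
\mathit{player} \fatsemi \llparenthesis\, \lnot \mathit{nearwin} \,\rrparenthesis$, and hence compute that for all $x : \mathit{Int}$, $X : \mathbb{P}\,\mathit{Int}$, 

$$x \; \mathit{playerwin}_0 \; X \;\Leftrightarrow\; (x \bmod 3 = 0 \land x - 2 \in X) \;\lor\; (x \bmod 3 = 2 \land x - 1 \in X)$$

which is the well known strategy of making the number of matches always satisfy 
the invariant $x \bmod 3 = 1$. 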

This strategy still has a problem, which is that $\mathit{playerwin}_0$ is not demonic, 
and hence not feasible in the sense of Section 3.3. The reason it is not demonic is 
that our player fails to make a move in the case that $x \bmod 3 = 1$, since he cannot 
guarantee to win in this case. For the refinement to be feasible, $\mathit{player}$ must still 
remove either one or two matches in this case. Suppose for simplicity that $\mathit{player}$ 
chooses a strategy $\mathit{playerwin}$ always to remove one match if $x \bmod 3 = 1$: 

$$x \; \mathit{playerwin} \; X \;\Leftrightarrow\; x \; \mathit{playerwin}_0 \; X \;\lor\; (x \bmod 3 = 1 \land x - 1 \in X)$$

and define 

$$\mathit{strategy} = (\mathit{playerwin} \fatsemi \mathit{opponent})^p$$

This is still a valid refinement of $\mathit{nim}$, since $\mathit{player} \sqsubseteq_d \mathit{playerwin}$, and it still 
implements a winning strategy in the sense of (7) if $x \bmod 3 \neq 1$ because 
$\mathit{win} \subseteq \mathit{strategy}$ and so by monotonicity 

$$x \; \mathit{win} \; \{0, -1\} \;\Rightarrow\; x \; \mathit{strategy} \; \{0, -1\}$$

Since $\mathit{strategy}$ is demonic it has a single strongest postcondition, which must be 
$\{0, -1\}$. 
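
The refined strategy is deterministic for our player, so it can be run against every possible sequence of opponent replies. The sketch below is one illustrative reading of $\mathit{playerwin}$ as a Python function; the helper names are hypothetical, and the opponent is assumed to remove one or two matches at each turn.

```python
def playerwin(x: int) -> int:
    """Our player's move: restore the invariant x mod 3 = 1 when possible."""
    if x % 3 == 0:
        return x - 2
    # x mod 3 = 2: removing one match restores the invariant;
    # x mod 3 = 1: no winning move exists, so fall back to removing one match.
    return x - 1


def rounds_to_win(x: int) -> int:
    """Equation (6)."""
    return (x - 1) // 3 + 1


def outcomes(x: int, rounds: int) -> set:
    """All pile sizes reachable after the given number of rounds,
    taken over every possible reply of the opponent."""
    states = {x}
    for _ in range(rounds):
        states = {playerwin(s) - k for s in states for k in (1, 2)}
    return states


for x in range(2, 40):
    if x % 3 != 1:
        assert outcomes(x, rounds_to_win(x)) == {0, -1}
```

Each assertion confirms that, whatever the opponent does, the game reaches exactly the postcondition $\{0, -1\}$ after $p$ rounds, so the opponent is the one who removes the last match.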



4.2 Team Selection 

The following resource sharing protocol concerns the selection of two sports 
teams from a list of available players. Each team is to have n players, and there 
are more than $2n$ available players for the team managers to choose from. 

For this team selection problem, fair division is not always possible. For 
example, if one player was so talented that both team managers would value any 
team containing that player more highly than any team without, then one of the 
managers will end up disappointed. We will consider Dawson’s team selection 
protocol from [Daw97], which does however satisfy correctness conditions that 
are close to fair, as we shall see. The following is a slightly simplified version of 
Dawson’s protocol: 

Step 1 The manager for the first team (Manager 1) selects two (disjoint) teams 
of n players. 

Step 2 Manager 2 selects one of the two teams, giving the other team to Man- 
ager 1. 

Step 3 Beginning with Manager 2, the managers take turns to swap some (pos- 
sibly none) of the players on their own team with some from the pool of 
leftover players. The protocol ends when neither manager wishes to make 
any further swaps. 

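Given a type $\mathit{Player}$, a relation describing the first step of the protocol is 
$\mathit{partition} : \mathbb{P}\,\mathit{Player} \to \mathbb{P}\,\mathbb{P}\,\mathit{Player} \times \mathbb{P}\,\mathit{Player}$, where 

$$\mathit{ps} \; \mathit{partition} \; (\{t, t'\}, \mathit{pool}) \;\Leftrightarrow\; \mathit{ps} = t \uplus t' \uplus \mathit{pool} \;\land\; \#t = \#t' = n$$

where $\mathit{ps}$ represents the input list of players, and $\uplus$ is disjoint set union. 

The second step of the team selection protocol can be described by the relation 
$\mathit{select} : \mathbb{P}\,\mathbb{P}\,\mathit{Player} \times \mathbb{P}\,\mathit{Player} \to \mathbb{P}\,\mathit{Player} \times \mathbb{P}\,\mathit{Player} \times \mathbb{P}\,\mathit{Player}$, where 

$$(\{t_1, t_2\}, p) \; \mathit{select} \; (t_1, t_2, p)$$

Thus $t_1$ is the team for Manager 1, $t_2$ is the team for Manager 2, and $p$ is the 
pool of unselected players. 

The swapping described in the third step of the protocol can be defined with 
the two relations 

$$(t_1 \uplus q,\; t_2,\; p \uplus q') \; \mathit{swap}_1 \; (t_1 \uplus q',\; t_2,\; p \uplus q), \quad \text{if } \#q = \#q'$$
$$(t_1,\; t_2 \uplus q,\; p \uplus q') \; \mathit{swap}_2 \; (t_1,\; t_2 \uplus q',\; p \uplus q), \quad \text{if } \#q = \#q'$$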






If we now deem Manager 1 (say) to be the angel, and Manager 2 to be the demon, 
then the description of the protocol can be translated into the multirelation $P$, 
where 

$$P = \langle \mathit{partition} \rangle \fatsemi [\mathit{select}] \fatsemi ([\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle)^+$$

Instead of starting from a specification of some kind and refining it to pro- 
duce the above protocol, here we have simply modelled an existing protocol by 
expressing it in the calculus of multirelations. As discussed above, this protocol 
cannot possibly guarantee a fair division for all input sets of players, but we 
would still like to be able to prove that certain other correctness conditions do 
hold. 

We will use a simplistic model, assuming that each Manager $i$ has a value 
function $\mathit{value}_i : \mathit{Player} \to \mathbb{R}$ and values a whole team by the sum of its players’ 
values. It could then be said that the protocol achieves fair division if each 
manager can select a team containing half of his $2n$ most highly valued players. 
It turns out that this is achievable for Manager 2, and nearly achievable for 
Manager 1. Defining $N_2$ to be half the sum of the values of the best $2n$ players 
(with respect to $\mathit{value}_2$), this is the (demonic) correctness condition $C_2$ we shall 
prove that Manager 2 can satisfy: 

$$C_2\,(t_1, t_2, p) = \Big(\sum_{q \in t_2} \mathit{value}_2\,q\Big) \ge N_2$$

Turning to the interests of Manager 1, it would be ill-advised to let high value 
players lie unchosen in the pool at the first step of the protocol, because they can 
be swapped into the team of Manager 2 at the first swap. The best Manager 1 
can hope for is to partition the teams from the $2n$ players he considers best, 
making the teams as even as possible. Thus, we define $N_1$ to be the largest 
possible sum of a team composed of $n$ of the $2n$ best players (with respect to 
$\mathit{value}_1$), such that the other $n$ players have a total value at least as high. Thus 
this is the (angelic) correctness condition that we will use for Manager 1: 

$$C_1\,(t_1, t_2, p) = \Big(\sum_{q \in t_1} \mathit{value}_1\,q\Big) \ge N_1$$

In order to show that the protocol $P$ is correct with respect to $C_1$ it is sufficient to 
show, for some other multirelation $P_1$ such that $P_1 \sqsubseteq_a P$, that $P_1$ is correct with 
respect to $C_1$, because correctness is preserved by refinement. We can calculate 

$$\begin{array}{cl}
 & P \\
= & \{\text{Definition}\} \\
 & \langle \mathit{partition} \rangle \fatsemi [\mathit{select}] \fatsemi ([\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle)^+ \\
\supseteq & \{\text{Transitive closure, Lemma 15}\} \\
 & \langle \mathit{partition} \rangle \fatsemi [\mathit{select}] \fatsemi [\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle
\end{array}$$

Denoting the resulting multirelation by $P_1$, it is straightforward to see that $P_1$ 
satisfies $C_1$: if Manager 1 partitions the input set of players $\mathit{ps}$ to produce two 
teams containing his $2n$ best players that are as evenly matched as possible, then 
by definition each team will have total value (with respect to $\mathit{value}_1$) at least $N_1$, 
so that $\mathit{ps} \; (\langle \mathit{partition} \rangle \fatsemi [\mathit{select}]) \; C_1$. Subsequently, $[\mathit{swap}_2]$ cannot change $t_1$, and 
$\langle \mathit{swap}_1 \rangle$ has the option of leaving $t_1$ unchanged, so Manager 1 can guarantee to 
be able to satisfy condition $C_1$. 

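For Manager 2, we must look at demonic correctness, and we now perform a 
similar calculation, this time with respect to demonic refinement: 

$$\begin{array}{cl}
 & P \\
= & \{\text{Definition}\} \\
 & \langle \mathit{partition} \rangle \fatsemi [\mathit{select}] \fatsemi ([\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle)^+ \\
\subseteq & \{\text{Closure, Lemma 15}\} \\
 & \langle \mathit{partition} \rangle \fatsemi [\mathit{select}] \fatsemi [\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle \fatsemi ([\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle)^* \\
= & \{\text{Lifting distributes through composition}\} \\
 & \langle \mathit{partition} \rangle \fatsemi [\mathit{select} \,;\, \mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle \fatsemi ([\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle)^*
\end{array}$$

Denoting the result by $P_2$, we thus have that $P_2 \sqsubseteq_d P$, and so if we can prove 
that Manager 2 can guarantee $P_2$ to satisfy $C_2$, then that must also be the 
case for $P$. The correctness of $P_2$ is now easier to demonstrate, by considering 
demonic guarantees for $[\mathit{select} \,;\, \mathit{swap}_2]$. 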

At the first step, Manager 1 (the angel) has complete control over the partitioning. 
To complete the subsequent step, $\mathit{select} \,;\, \mathit{swap}_2$, Manager 2 should first 
note where, within the suggested teams and pool, his $2n$ most highly valued 
players are. Suppose that this set of players is represented by the disjoint union 

$$b_1 \uplus \mathit{bp}_1 \uplus b_2 \uplus \mathit{bp}_2$$

for player sets satisfying 

$$b_1 \subseteq t \;\land\; b_2 \subseteq t' \;\land\; \mathit{bp}_1 \uplus \mathit{bp}_2 \subseteq p \;\land\; \#b_1 + \#\mathit{bp}_1 = \#b_2 + \#\mathit{bp}_2 = n$$

for teams and pool $(\{t, t'\}, p)$. 

Manager 2 should then choose the most valuable of $b_1 \uplus \mathit{bp}_1$ and $b_2 \uplus \mathit{bp}_2$ 
for his team ($t_2$) during his execution of $[\mathit{select} \,;\, \mathit{swap}_2]$. As the value of the 
players in $b_1 \uplus \mathit{bp}_1 \uplus b_2 \uplus \mathit{bp}_2$ is $2 N_2$, this offers Manager 2 a guarantee of being 
able to satisfy $C_2$, and therefore $\langle \mathit{partition} \rangle \fatsemi [\mathit{select} \,;\, \mathit{swap}_2]$ satisfies the demonic 
correctness condition $C_2$. Subsequently, for $\langle \mathit{swap}_1 \rangle \fatsemi ([\mathit{swap}_2] \fatsemi \langle \mathit{swap}_1 \rangle)^*$, 
a similar argument to that for Manager 1 can be used, as Manager 1 cannot 
guarantee to change $t_2$ subsequently and Manager 2 always has the option to 
keep $t_2$ unchanged. Thus Manager 2 can guarantee to satisfy $C_2$, and the protocol 
is demonically correct for $C_2$. 
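
Manager 2's choice in the combined step $\mathit{select} \,;\, \mathit{swap}_2$ can be sketched as follows. This is an illustration under stated assumptions rather than part of Dawson's protocol: players are strings, `value2` is a hypothetical valuation table, ties are broken by name, and the pool members of the best $2n$ players are split between the two candidate teams in an arbitrary but fixed way.

```python
from typing import Dict, Set


def manager2_team(t: Set[str], t_prime: Set[str], pool: Set[str],
                  value2: Dict[str, float], n: int) -> Set[str]:
    """Team Manager 2 can secure from the partition ({t, t'}, pool)."""
    total = lambda team: sum(value2[q] for q in team)
    ranked = sorted(t | t_prime | pool, key=lambda q: (-value2[q], q))
    best = set(ranked[:2 * n])            # Manager 2's 2n most highly valued players
    b1, b2 = best & t, best & t_prime     # best players already placed in a team
    bp = sorted(best & pool)              # best players left in the pool
    bp1, bp2 = set(bp[:n - len(b1)]), set(bp[n - len(b1):])
    # Select the team containing b_i, then swap its other members for bp_i.
    return max(b1 | bp1, b2 | bp2, key=total)


value2 = {"a": 1, "b": 9, "c": 2, "d": 3, "e": 8, "f": 7, "g": 0}
n = 2
team = manager2_team({"a", "b"}, {"c", "d"}, {"e", "f", "g"}, value2, n)
n2 = sum(sorted(value2.values(), reverse=True)[:2 * n]) / 2   # the bound N_2
print(sorted(team), sum(value2[q] for q in team) >= n2)
```

Since the two candidate teams together hold exactly the best $2n$ players, the more valuable of the two is worth at least $N_2$, which is the guarantee used in the argument above.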

5 Conclusion 

This paper consists of some preliminary steps towards the construction of a 
new calculus for modelling nondeterministic specifications as multirelations. The 
model is appealing because it combines some of the merits of the existing models 
of relations and predicate transformers. Like relations, multirelations model 
specifications from initial to final state, and like predicate transformers, they can 
model two kinds of nondeterminism within a single framework. This is achieved 
by mapping input values to postconditions rather than output values. So the 
multirelational model is more expressive than that of relations, but not quite as 
transparent. Therefore we recommend its use only for the description of systems 
that cannot be expressed more simply using ordinary relations. 

The two forms of nondeterminism in a multirelational specification represent 
two potentially conflicting interests that can be viewed in a number of different 
ways. Typically, we think of one of these interests as representing an angel and 
the other as a demon, where the angel is on our side but the demon is not. Not all 
systems can be uniquely categorised in this way however. For example, in the case 
of a game or resource sharing protocol involving two participants with opposing 
interests, either participant could be modelled as the angel, depending on the 
circumstances. Consequently, it is sometimes useful to associate the different 
interests in a specification with different agents, without necessarily labelling 
them as angelic or demonic. The number of agents involved in such a specification 
is not limited to two, but it can only be modelled as a multirelation if the 
allegiance of each agent is known, with the choices made by our allies treated 
as angelic, and those of our enemies as demonic. This observation is made in 
[BvW98], where the choices of multiple agents are first represented by indexed 
operators, but later categorised as simply angelic or demonic. 

The concept of agents is also useful for distinguishing between the internal 
and external choices in a specification. We use this idea to motivate the need 
for a new notion of refinement: we claim that a refinement is equivalent to a 
reduction in internal choice or an increase in external choice. This leads to two 
opposing definitions of angelic and demonic refinement that are identical to those 
that have been suggested for relations [BvW98]. The choice of which one to use 
must be made at the start of program development and stay fixed throughout 
the design process. We have provided corresponding new notions of correctness 
and feasibility, and demonstrated that refinement is a correctness preserving 
operation. Feasibility is interpreted to mean the absence of internal choice, since 
external choices can remain in the program to be executed interactively at run- 
time. 

Although none of the calculus presented so far is tied exclusively to the deriva- 
tion of functional programs, this is one of the areas we plan to develop in the 
future. The central result that we wish to exploit concerns the extension of ini- 
tial algebras from relations to multirelations [Moo92] . This result was originally 
expressed in the context of predicate transformers, but was difficult to apply in 
practice, because predicate transformers, unlike multirelations, model programs 
in reverse. The reason that we feel that this is an important result is that its 
counterpart for total functions and relations is one of the primary building blocks 
of the relational calculus of [BdM97] : it provides the key to the extension of the 
fold operator on regular datatypes of functional programming to relations. The 
corresponding extension of fold to multirelations has a very familiar form, unlike 
its predicate transformer equivalent. Associated with this definition is a universal 
property and fusion law that can be used in the calculation of programs. 

Some of the areas that we are currently exploring as applications of this the- 
ory are security protocols, resource-sharing protocols and game theoretic mecha- 
nisms such as voting procedures. We hope to use multirelations to reason about 
such systems and derive implementations that meet various requirements. In 
the case of a game, this could be interpreted as the implementation of a winning 
strategy, and in the case of resource-sharing protocol, it could be a protocol that 
achieves fair division. In practice, few games actually have a winning strategy, 
but there are weaker game theoretic concepts that are desirable in an implemen- 
tation. For example, Pauly [P03] proposes the existence of a subgame equilibrium 
as an alternative notion of correctness. Similarly, there are a number of different 
criteria that are desirable in a resource-sharing protocol. 

Appendix 

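Proof of Lemma 11 

$$\begin{array}{cl}
 & x \; (M \fatsemi N) \; X \\
\Rightarrow & \{\text{Definition 10}\} \\
 & (\exists Y :: x \; M \; Y \land (\forall y : y \in Y : y \; N \; X)) \\
\Rightarrow & \{\text{Definition 3}\} \\
 & (\exists Y :: (\exists Z : Z \in \mathit{sp}(x, M) : Z \subseteq Y \land (\forall y : y \in Y : y \; N \; X))) \\
\Rightarrow & \{\text{Logic}\} \\
 & (\exists Z : Z \in \mathit{sp}(x, M) : (\forall z : z \in Z : z \; N \; X))
\end{array}$$

Conversely, 

$$\begin{array}{cl}
 & (\exists Z : Z \in \mathit{sp}(x, M) : (\forall z : z \in Z : z \; N \; X)) \\
\Rightarrow & \{\text{Definition 3}\} \\
 & (\exists Y :: x \; M \; Y \land (\forall y : y \in Y : y \; N \; X)) \\
\Rightarrow & \{\text{Definition 10}\} \\
 & x \; (M \fatsemi N) \; X
\end{array}$$

□ 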

Proof of Lemma 13 The proof is by induction on $n$. For ease of reference, we 
will number the two assumptions in this lemma as follows 

1. $M \fatsemi K = K \fatsemi N$ 

2. $N = K \fatsemi N$ 

Base case. Suppose $n = 1$, then 

$$\begin{array}{cl}
 & M \fatsemi K \\
= & \{(1)\} \\
 & K \fatsemi N \\
= & \{(2)\} \\
 & N
\end{array}$$

Inductive step. Suppose that $M^n \fatsemi K = N^n$, then 

$$\begin{array}{cl}
 & M^{n+1} \fatsemi K \\
= & \{\text{Definition of } M^{n+1}\} \\
 & M^n \fatsemi M \fatsemi K \\
= & \{(1)\} \\
 & M^n \fatsemi K \fatsemi N \\
= & \{\text{Inductive hypothesis}\} \\
 & N^n \fatsemi N \\
= & \{\text{Definition of } N^{n+1}\} \\
 & N^{n+1}
\end{array}$$

□ 

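Proof of Lemma 15 That $M \subseteq M^+$ follows directly from the definition of 
$M^+$. For $M^+ \subseteq M \fatsemi M^*$, we have 

$$\begin{array}{cl}
 & x \; M^+ \; X \\
= & \{\text{Definition 14}\} \\
 & (x, X) \in \bigcup \{M^m \mid m > 0\} \\
= & \{\text{Logic}\} \\
 & (\exists m : m > 0 : x \; M^m \; X) \\
= & \{\text{Taking } n = m - 1; \text{ composition}\} \\
 & (\exists n : n \ge 0 : x \; (M \fatsemi M^n) \; X) \\
= & \{\text{Definition 10}\} \\
 & (\exists n : n \ge 0 : (\exists Y :: x \; M \; Y \land (\forall y : y \in Y : y \; M^n \; X))) \\
= & \{\text{Logic}\} \\
 & (\exists Y :: x \; M \; Y \land (\exists n : n \ge 0 : (\forall y : y \in Y : y \; M^n \; X))) \\
\Rightarrow & \{\text{Logic}\} \\
 & (\exists Y :: x \; M \; Y \land (\forall y : y \in Y : (\exists n : n \ge 0 : y \; M^n \; X))) \\
= & \{\text{Logic; Definition 14}\} \\
 & (\exists Y :: x \; M \; Y \land (\forall y : y \in Y : y \; M^* \; X)) \\
= & \{\text{Definition 10}\} \\
 & x \; (M \fatsemi M^*) \; X
\end{array}$$

□ 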

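Proof of Lemma 20 The proof is by induction on $n$. 

Base case. By the definition of $\mathit{round}$, the result holds if $n = 1$. 

Inductive step. Suppose that 

$$x \; \mathit{round}^n \; X \;\Leftrightarrow\; \{y, y + 1\} \subseteq X \;\lor\; \{y - 1, y\} \subseteq X$$

where $y = x - 3n$. Then 

$$\begin{array}{cl}
 & x \; \mathit{round}^{n+1} \; X \\
\Leftrightarrow & \{\text{Definition of } \mathit{round}^{n+1}\} \\
 & x \; (\mathit{round}^n \fatsemi \mathit{round}) \; X \\
\Leftrightarrow & \{\text{Lemma 11}\} \\
 & (\exists Z : Z \in \mathit{sp}(x, \mathit{round}^n) : (\forall z : z \in Z : z \; \mathit{round} \; X)) \\
\Leftrightarrow & \{\text{Inductive hypothesis}\} \\
 & (\exists Z : Z \in \{\{y, y + 1\}, \{y - 1, y\}\} : (\forall z : z \in Z : z \; \mathit{round} \; X)) \\
\Leftrightarrow & \{\text{Logic}\} \\
 & ((y - 1) \; \mathit{round} \; X \land y \; \mathit{round} \; X) \lor (y \; \mathit{round} \; X \land (y + 1) \; \mathit{round} \; X) \\
\Leftrightarrow & \{\text{Distributive law, commutativity}\} \\
 & y \; \mathit{round} \; X \land ((y - 1) \; \mathit{round} \; X \lor (y + 1) \; \mathit{round} \; X) \\
\Leftrightarrow & \{\text{Definition of } \mathit{round}\} \\
 & (\{y - 2, y - 3\} \subseteq X \lor \{y - 3, y - 4\} \subseteq X) \;\land \\
 & (\{y - 1, y - 2\} \subseteq X \lor \{y - 2, y - 3\} \subseteq X \lor \{y - 3, y - 4\} \subseteq X \lor \{y - 4, y - 5\} \subseteq X) \\
\Leftrightarrow & \{\text{Absorption}\} \\
 & (\{y - 2, y - 3\} \subseteq X \lor \{y - 3, y - 4\} \subseteq X) \\
\Leftrightarrow & \{\text{Let } w = x - 3(n + 1)\} \\
 & (\{w, w + 1\} \subseteq X \lor \{w - 1, w\} \subseteq X)
\end{array}$$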

□ 

Abstract. We propose a relaxation of Kleene algebra by giving up 
strictness and right-distributivity of composition. This allows the subsumption 
of Dijkstra’s computation calculus, Cohen’s omega algebra and 
von Wright’s demonic refinement algebra. Moreover, by adding domain 
and codomain operators we can also incorporate modal operators. Fi- 
nally, it is shown that predicate transformers form lazy Kleene algebras 
again, the disjunctive and conjunctive ones even lazy Kleene algebras 
with an omega operation. 



1 Introduction 

Kleene algebra (KA) provides a convenient and powerful algebraic axiomati- 
zation of the basic control constructs composition, choice and iteration. In its 
standard version, composition is required to distribute over choice in both argu- 
ments; also, 0 is required to be both a left and right annihilator. Algebraically 
this is captured by the notion of an idempotent semiring or briefly I-semiring. 

Models include formal languages under concatenation, relations under stan- 
dard composition and sets of graph paths under path concatenation. 

The idempotent semiring addition induces a partial order that can be thought 
of as the approximation order or as (angelic) refinement. Addition then coincides 
with the binary supremum operator, i.e., every semiring is also an upper semilattice. 
Moreover, 0 is the least element and thus plays the role of $\bot$ in denotational 
semantics. 

If the semilattice is even a complete lattice, the least and greatest fixpoint 
operators allow definitions of the finite and infinite iteration operators $^*$ and $^\omega$, 
resp. However, to be less restrictive, we do not assume completeness and rather 
add, as is customary, $^*$ and $^\omega$ as operators of their own with particular axioms. 

The requirement that 0 be an annihilator on both sides of composition makes 
the algebra strict. This prohibits a natural treatment of lazy computation sys- 
tems in which e.g. infinite sequences of states may occur. Therefore we study 
a “one-sided” variant of KAs in which composition is strict in one argument 
only. This treatment fits well with systems such as the calculus of finite and 
infinite streams which is also used in J. Lukkien’s operational semantics for the 
guarded command language [15, 16] or R. Dijkstra’s computation calculus [8, 
9]. Inspired by the latter papers, we obtain a very handy algebraic characterization
of finite and infinite elements that also appears already in early work
on so-called quemirings by Elgot [10]. In addition, we integrate the theory with
Cohen’s $\omega$-algebra [4] and von Wright’s demonic refinement algebra [21,22].

There is some choice in what to postulate for the right argument of compo- 
sition. Whereas the above-mentioned authors stipulate binary or even general 
positive disjunctivity, we investigate how far one gets if only isotonicity is re- 
quired. This allows general isotone predicate transformers as models. 

Fortunately, our lazy KAs are still powerful enough to admit the incorpo- 
ration of domain and codomain operators and hence an algebraic treatment of 
modal logic. Of course, the possibility of nontrivial infinite computations leads 
to additional terms in the corresponding assertion logic; these terms disappear 
when only finite elements are considered. 

Altogether, we obtain a quite lean framework that unites assertion logic with 
algebraic reasoning while admitting infinite computations. The axiomatization 
is simpler and more general than that of von Karger’s sequential calculus [11]. 

2 Left Semirings 

Definition 2.1 A left (or lazy) semiring, briefly an L-semiring, is a quintuple
$(K, +, 0, \cdot, 1)$ with the following properties:

1. $(K, +, 0)$ is a commutative monoid.

2. $(K, \cdot, 1)$ is a monoid.

3. The $\cdot$ operation distributes over $+$ in its left argument and is left-strict:

$$(a + b) \cdot c = a \cdot c + b \cdot c , \qquad 0 \cdot a = 0 .$$

Definition 2.2 An idempotent left semiring, or briefly IL-semiring, is an L-semiring
$(K, +, 0, \cdot, 1)$ with idempotent addition in which $\cdot$ is right-isotone:

$$a + a = a \;\wedge\; (b \le c \Rightarrow a \cdot b \le a \cdot c) ,$$

where the natural order $\le$ on $K$ is given by $a \le b \stackrel{\mathrm{def}}{\Leftrightarrow} a + b = b$.

Note that left-isotonicity of $\cdot$ follows from its left-distributivity. Moreover, 0 is
the least element w.r.t. the natural order. The left semiring structure without the
requirement of right-isotonicity is also at the core of process algebra frameworks
(see e.g. [3]), where $\delta$ (inaction) plays the role of 0. Since, however, we will make
essential use of right-isotonicity, only a few of our results will carry over to that
setting.

By isotonicity, $\cdot$ is universally superdisjunctive and universally subconjunctive
in both arguments; we state these properties for the right argument:

$$a \cdot (\bigsqcup L) \ge \bigsqcup \{a \cdot l : l \in L\} , \qquad a \cdot (\bigsqcap L) \le \bigsqcap \{a \cdot l : l \in L\} .$$

Analogous properties hold for the left argument. 

From this we can conclude a weak form of right distributivity for the left
hand side of inequations:

Lemma 2.3 For $a, b, c, d \in K$ we have

$$b + c \le d \;\Rightarrow\; a \cdot b + a \cdot c \le a \cdot d . \qquad (1)$$

Proof. By isotonicity and superdisjunctivity we get

$$b + c \le d \;\Rightarrow\; a \cdot (b + c) \le a \cdot d \;\Rightarrow\; a \cdot b + a \cdot c \le a \cdot d .$$ □

Definition 2.4 1. A function between partial orders is called universally dis- 
junctive if it preserves all existing suprema. A binary operation is called 
universally left- (right-) disjunctive if it is universally disjunctive in its left 
(right) argument. 

2. An IL-semiring $(K, +, 0, \cdot, 1)$ is bounded if $K$ has a greatest element $\top$ w.r.t.
the natural order. It is complete if the semilattice $(K, \le)$ is a complete lattice
and $\cdot$ is universally left-disjunctive.

3. Finally, $K$ is Boolean if $(K, \le)$ is a Boolean algebra, i.e., a complemented
distributive lattice. Every Boolean IL-semiring is bounded.

Now we look at the composition from the other end. 

Definition 2.5 For a binary operation $\cdot : K \times K \to K$ we define its mirror
operation $\breve{\cdot} : K \times K \to K$ by $x \mathbin{\breve{\cdot}} y = y \cdot x$. We call $(K, +, 0, \cdot, 1)$ an (idempotent)
right semiring (briefly (I)R-semiring) if $(K, +, 0, \breve{\cdot}, 1)$ is an (I)L-semiring. The
notions of a complete and Boolean (I)R-semiring are defined analogously. If
$K$ is both an (I)L-semiring and an (I)R-semiring it is called an (I-)semiring.
The notions of a complete and Boolean (I-)semiring are defined analogously. A
complete I-semiring is also called a standard Kleene algebra [5] or a quantale [19].

Note, however, that in (I-)semirings composition is also right-strict; hence 
these structures are not very interesting if one wants to model lazy computation 
systems. Prominent I-semirings are the algebra of binary relations under rela- 
tional composition and the algebra of formal languages under concatenation or 
join (fusion product). 

3 Particular IL-Semirings 

We now introduce our two main models of the notion of IL-semiring. Both of 
them are based on finite and infinite strings over an alphabet A. Next to their 
classical interpretation as characters, the elements of A may e.g. be thought of 
as states in a computation system, or, in connection with graph algorithms, as 
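graph nodes. Then, as usual, $A^*$ is the set of all finite words over $A$; the empty
word is denoted by $\varepsilon$. Moreover, $A^\omega$ is the set of all infinite words over $A$. We
set $A^\infty \stackrel{\mathrm{def}}{=} A^* \cup A^\omega$. The length of a word $s$ is denoted by $|s|$. By $\cdot$ we denote
concatenation, where $s \cdot t \stackrel{\mathrm{def}}{=} s$ if $|s| = \infty$. A language over $A$ is a subset of $A^\infty$.
As usual, we identify a singleton language with its only element. For a language
$S \subseteq A^\infty$ we define its infinite and finite parts by

$$\mathrm{inf}\, S \stackrel{\mathrm{def}}{=} S \cap A^\omega , \qquad \mathrm{fin}\, S \stackrel{\mathrm{def}}{=} S - \mathrm{inf}\, S .$$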







Definition 3.1 The algebra $\mathrm{WOR} \stackrel{\mathrm{def}}{=} (\mathcal{P}(A^\infty), \cup, \emptyset, \cdot, \varepsilon)$ is obtained by extending
$\cdot$ to languages in the following way:

$$S \cdot T \stackrel{\mathrm{def}}{=} \mathrm{inf}\, S \,\cup\, \{s \cdot t : s \in \mathrm{fin}\, S \wedge t \in T\} .$$

Note that in general $S \cdot T \ne \{s \cdot t : s \in S \wedge t \in T\}$; using the set on
the right hand side as the definition of $S \cdot T$ one would obtain a right-strict
operation. With the definition given, we have $S \cdot \emptyset = \mathrm{inf}\, S$ and hence $S \cdot \emptyset = \emptyset$
iff $\mathrm{inf}\, S = \emptyset$. It is straightforward to show that WOR is an IL-semiring. The
algebra is well-known from the classical theory of $\omega$-languages (see e.g. [20] for
a recent survey).
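
As an illustration of Definition 3.1, the following Python sketch models the composition of WOR on a small scale. It rests on an assumption that is not part of the text: an infinite word is encoded as a pair (prefix, loop) standing for prefix · loop^ω, and a finite word as a pair with an empty loop. The names `concat`, `compose`, `inf_part` and `fin_part` are hypothetical helpers chosen for this example only.

```python
# A word is a pair (prefix, loop).  loop == "" encodes the finite word `prefix`;
# a non-empty loop encodes the infinite word prefix . loop^omega.
# Assumption of this sketch: the encoding is not normalised, so two different
# pairs may denote the same infinite word.

def is_infinite(w):
    return w[1] != ""

def concat(s, t):
    """Word concatenation s . t, with s . t = s whenever s is infinite."""
    if is_infinite(s):
        return s
    return (s[0] + t[0], t[1])

def inf_part(S):
    """inf S: the infinite words of the language S."""
    return {w for w in S if is_infinite(w)}

def fin_part(S):
    """fin S: the finite words of the language S."""
    return {w for w in S if not is_infinite(w)}

def compose(S, T):
    """S . T = inf S  u  { s . t : s in fin S, t in T }."""
    return inf_part(S) | {concat(s, t) for s in fin_part(S) for t in T}

EMPTY = set()                  # the element 0
UNIT = {("", "")}              # the language containing only the empty word

S = {("ab", ""), ("a", "b")}   # finite word ab and infinite word a . b^omega
T = {("c", ""), ("", "d")}     # finite word c and infinite word d^omega
```

With these values, `compose(S, EMPTY)` keeps only the infinite word of `S`, in line with $S \cdot \emptyset = \mathrm{inf}\, S$, whereas `compose(EMPTY, S)` is empty, in line with left-strictness.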

Next to this model we will use a second one that has a more refined view of 
composition and hence allows more interesting modal operators. 

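Definition 3.2 For words $s, t \in A^\infty$ we define their join or fusion product $s \bowtie t$
as a language-valued operation:

$$s \bowtie t \stackrel{\mathrm{def}}{=} \begin{cases} s & \text{if } |s| = \infty , \\ \mathit{init}(s) \cdot (\mathit{last}(s) \cap \mathit{head}(t)) \cdot \mathit{tail}(t) & \text{otherwise} , \end{cases}$$

where $\mathit{head}(\varepsilon) \stackrel{\mathrm{def}}{=} \mathit{tail}(\varepsilon) \stackrel{\mathrm{def}}{=} \mathit{init}(\varepsilon) \stackrel{\mathrm{def}}{=} \mathit{last}(\varepsilon) \stackrel{\mathrm{def}}{=} \varepsilon$, viewed as a singleton
language.

The definition entails $\varepsilon \bowtie \varepsilon = \varepsilon$ and $s \bowtie t = \emptyset$ when $\mathit{last}(s) \ne \mathit{head}(t)$, i.e.,
a non-empty finite word $s$ can be joined with a non-empty word $t$ iff the last
letter of $s$ coincides with the first one of $t$; only one copy of that letter is kept
in the joined word. Since we view the infinite words as streams of computations,
we call the model based on this composition operation STR.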

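Definition 3.3 The algebra $\mathrm{STR} \stackrel{\mathrm{def}}{=} (\mathcal{P}(A^\infty), \cup, \emptyset, \bowtie, A \cup \varepsilon)$ is given by extending
$\bowtie$ to languages in the following way:

$$S \bowtie T \stackrel{\mathrm{def}}{=} \mathrm{inf}\, S \,\cup\, \bigcup \{s \bowtie t : s \in \mathrm{fin}\, S \wedge t \in T\} .$$

Analogously to above, we have $S \bowtie \emptyset = \mathrm{inf}\, S$ and hence $S \bowtie \emptyset = \emptyset$ iff
$\mathrm{inf}\, S = \emptyset$. It is straightforward to show that STR is an IL-semiring. Its subalgebra
$(\mathcal{P}(A^\infty - \varepsilon), \cup, \emptyset, \bowtie, A)$ of nonempty words is at the heart of the papers by
Lukkien [15,16] and Dijkstra [8,9].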
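
The fusion product can be sketched for finite words as follows. This is a minimal illustration under stated assumptions: words are plain Python strings, infinite words are left out entirely (so $\mathrm{inf}\, S$ is always empty here), and the alphabet `ALPHABET` as well as the function names `fuse` and `fuse_lang` are hypothetical. The treatment of the empty word follows one reading of the convention $\mathit{last}(\varepsilon) = \mathit{head}(\varepsilon) = \varepsilon$: the empty word can only be joined with itself.

```python
def fuse(s, t):
    """Join of two finite words, returned as a (possibly empty) set of words."""
    if s == "" and t == "":
        return {""}                      # the empty word joined with itself
    if s == "" or t == "":
        return set()                     # empty word vs. non-empty word: no overlap
    # Glue the words on their common letter and keep one copy of it.
    return {s + t[1:]} if s[-1] == t[0] else set()

def fuse_lang(S, T):
    """Extension of the join to languages of finite words."""
    return {w for s in S for t in T for w in fuse(s, t)}

ALPHABET = {"x", "y", "z"}
UNIT = ALPHABET | {""}                   # the neutral element A u {empty word} of STR
```

For example, `fuse("xy", "yz")` yields the single word `xyz`, while `fuse("xy", "zx")` is empty because the last letter of the first word differs from the first letter of the second.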

Both WOR and STR are even Boolean IL-semirings. Further IL-semirings 
are provided by predicate transformer algebras (see below). 

4 Terminating and Non-terminating Elements 

As stated, we want to model computation systems in such a way that the op- 
erator • represents sequential composition and 0 stands for the totally useless 
system abort which does not make any progress and hence may also be viewed 
as never terminating. 







Bernhard Moller 



As we are interested in treating finite and infinite computations uniformly, 
we need to characterize these notions algebraically. This will be achieved using 
the above properties of the finite and infinite parts of a language. 

Operationally, an infinite, non-terminating computation $a$ cannot be followed
by any further computation. Algebraically this means that composing $a$ with
any other element on the “infinite side” has no effect, i.e., just $a$ again results.
We write temporal succession from left to right, i.e., $a \cdot b$ means “first perform
computation $a$ and then $b$”. Therefore we give the following

Definition 4.1 Consider an IL-semiring $(K, +, 0, \cdot, 1)$. An element $a \in K$ is
called non-terminating or infinite if it is a left zero w.r.t. composition, i.e., if

$$\forall\, b \in K : a \cdot b = a .$$

The set of all non-terminating elements is denoted by $N$.

From the left-strictness of $\cdot$ we immediately get $0 \in N$. Moreover, we have
the following characterization of non-terminating elements:

Lemma 4.2 $a \in N \Leftrightarrow a \cdot 0 = a$.

Proof. ($\Rightarrow$) Choose $b = 0$ in the definition of $N$.

($\Leftarrow$) Using the assumption, associativity, left strictness and the assumption
again, we calculate $a \cdot b = a \cdot 0 \cdot b = a \cdot 0 = a$. □

By this characterization $N$ coincides with the set of fixpoints of the isotone
function $\lambda z \,.\, z \cdot 0$. Hence, if $K$ is even a complete lattice, by Tarski’s fixpoint
theorem $N$ is a complete lattice again.

Next we state two closure properties of N. 

Lemma 4.3 Denote by $\cdot$ also the pointwise extension of $\cdot$ to subsets of $K$.

1. An arbitrary computation followed by a non-terminating one is non-terminating,
i.e., $K \cdot N \subseteq N$ (and hence $K \cdot N = N$).

2. If $\cdot$ is universally left-disjunctive then $N$ is closed under $\bigsqcup$.

Proof. 1. Consider $a \in K$ and $b \in N$. Then $(a \cdot b) \cdot 0 = a \cdot (b \cdot 0) = a \cdot b$. The
inclusion $N \subseteq K \cdot N$ follows by $1 \in K$.

2. Consider $L \subseteq N$ such that $\bigsqcup L$ exists. Then, by the assumptions, $(\bigsqcup L) \cdot 0 =
\bigsqcup (L \cdot 0) = \bigsqcup L$. □

So the supremum in N coincides with the one in the overall algebra K. 

Now we relate the notions of right-strictness and termination. 

Lemma 4.4 The following properties are equivalent:

1. The $\cdot$ operation is right-strict.

2. $|N| = 1$.

3. $\top \cdot 0 = 0$ (provided $K$ is bounded).

Proof. (1 $\Rightarrow$ 2) By right-strictness $a \cdot 0 = 0$ for all $a \in K$, so Lemma 4.2 yields $N = \{0\}$.

(2 $\Rightarrow$ 3) Since $0 \in N$ and $\top \cdot 0 \in N$ we get $\top \cdot 0 = 0$.

(3 $\Rightarrow$ 1) For arbitrary $a \in K$ we have, by isotonicity, $a \cdot 0 \le \top \cdot 0 = 0$. □

Next we show 

Lemma 4.5 1. $N = \{a \cdot 0 : a \in K\}$.

2. $b \cdot 0$ is the greatest element of $N(b) = \{a \in N : a \le b\}$.

3. If $K$ is bounded then $\top \cdot 0$ is the greatest element of $N$. In particular, $\top \cdot 0 = \bigsqcup N$.

4. If $N$ is downward closed and $\top \in N$ then $1 = 0$ and hence $|K| = 1$.

Proof. 1. ($\subseteq$) Immediate from the definition of $N$.

($\supseteq$) Assume $z = a \cdot 0$. Then $z \cdot 0 = a \cdot 0 \cdot 0 = a \cdot 0 = z$.

2. First, assume $a \in N \wedge a \le b$. Then by isotonicity of $\cdot$ in its left argument we have $a =
a \cdot 0 \le b \cdot 0$. So $b \cdot 0$ is an upper bound of $N(b)$.

Second, by 1. we have $b \cdot 0 \in N$. By right-neutrality of 1 and isotonicity we
get $b \cdot 0 \le b \cdot 1 = b$, i.e., $b \cdot 0 \in N(b)$, which shows the claim.

3. Immediate from 2.

4. By downward closure, $1 \in N$, hence $1 = 1 \cdot 0 = 0$ by neutrality of 1. □

Property 3 of this lemma says that $\top \cdot 0$ is an adequate algebraic representation
of the collection of all non-terminating elements of a bounded IL-semiring.
This is used extensively in [8,9], where $\top \cdot 0$ is called the eternal part of $K$.
However, we want to manage without the assumption of completeness or boundedness
and therefore prefer to work with the set $N$ rather than with its greatest
element.

By Property 2 we may call $b \cdot 0$ the non-terminating or infinite part of $b$. This
leads to the following

Definition 4.6 We call an element $a$ finite if its infinite part is trivial, i.e., if
$a \cdot 0 = 0$. The set of all finite elements is denoted by $F$. By this definition $0 \in F$.
To mirror our operational understanding we call an element $a$ terminating if $a$
is finite and $a \ne 0$. We set $T \stackrel{\mathrm{def}}{=} F - \{0\}$.

A number of properties of F and T are collected in 

Lemma 4.7 1. $F$ is downward closed.

2. $1 \in F$. If $1 \ne 0$ then $1 \in T$ (“skip is terminating”).

3. For non-empty $S \subseteq K$ we have $S \subseteq F \Leftrightarrow S \cdot \{0\} = \{0\}$.

4. $K \cdot F = K = F \cdot K$.

5. $F + F \subseteq F$ and $T + T \subseteq T$ (finite and terminating computations are closed
under choice). Since $+$ is idempotent, we even have equality in both cases. If
$\cdot$ is universally left-disjunctive then $F$ is closed under arbitrary joins and $T$
under non-empty ones.

6. $F \cdot F \subseteq F$ (finite computations are closed under composition). By neutrality
of 1 we even have equality. $T$ need not be closed under composition.

Proof. 1. Immediate from isotonicity.

2. Immediate from left-neutrality of 1.

3. Immediate from the definition of $F$.

4. By left-neutrality of 1 we get $K = 1 \cdot K \subseteq F \cdot K$. Similarly, by right-neutrality
$K \subseteq K \cdot F$. The reverse inclusions are trivial.

5. Immediate from distributivity/disjunctivity.

6. By 3. we have $F \cdot F \cdot \{0\} = F \cdot \{0\} = \{0\}$, and 3. again shows the claim. □

Notation. Although we do not assume a general meet operation $\sqcap$, we will
sometimes use the formula $y \sqcap z = 0$; it is an abbreviation for $\forall u \,.\, u \le y \wedge u \le
z \Rightarrow u = 0$.

With the help of this, we can describe the interaction between F and N. 

Lemma 4.8 1. $N \cap F = \{0\}$.

2. If $N$ is downward closed, then for $x \in N$ and $y \in F$ we have $x \sqcap y = 0$.

3. Assume $x \in N \wedge y \in F$. Then $x + y \in N \Leftrightarrow y \le x$. Hence if $N$ is downward
closed, $x + y \in N \Leftrightarrow y = 0$.

Proof. 1. If $x \in N \cap F$ then $x = x \cdot 0 = 0$.

2. Suppose $z \le x \wedge z \le y$ for some $z \in K$. Then the assumption and
Lemma 4.7.1 imply $z \in N \cap F$, hence $z = 0$ by 1.

3. First we note that, by the assumption,

$$(x + y) \cdot 0 = x \cdot 0 + y \cdot 0 = x + 0 = x . \qquad (*)$$

($\Rightarrow$) If $(x + y) \cdot 0 = x + y$ then by $(*)$ $x = x + y$, i.e., $y \le x$.

($\Leftarrow$) If $y \le x$ then $x = x + y$ and hence $x + y = x = (x + y) \cdot 0$ by $(*)$. □

5 Separated IL-Semirings 

5.1 Motivation 

Although our definitions of finite and nonterminating elements have led to quite 
a number of useful properties, we are not fully satisfied, since the axiomatization 
does not lead to full symmetry of the two notions, whereas in actual computation 
systems they behave much more symmetrically. Moreover, a number of other 
desirable properties do not follow from the current axiomatization either. We 
list the desiderata: 

- While $\mathrm{inf}\, a \stackrel{\mathrm{def}}{=} a \cdot 0$ gives us the nonterminating part of $a$, we have no corresponding
  operator fin that yields the finite part of $a$. Next, inf is disjunctive;
  by symmetry we would expect that for fin as well.

- The set $F$ of finite elements is downward closed, whereas we cannot guarantee
  that for the set $N$ of nonterminating elements. However, since $a \le b$
  means that $a$ has at most as many choices as $b$, one would expect $a$ to
  be nonterminating if $b$ is: removing choices between infinite computations
  should not produce finite computations. Then, except for 0, the finite and
  nonterminating elements would lie completely separately.

- Every element should be decomposable into its finite and nonterminating
  part.

The task is now to achieve this without imposing too strong a restriction on the
semiring (such as requiring it to be a distributive or even a Boolean lattice).

5.2 Kernel Operations 

To prepare the treatment, we first state a few properties of kernel operations 
that will be useful both for partitioning functions and in connection with tests 
in the next section. 

Definition 5.1 A kernel operation is an isotone, contractive and idempotent
function $f : K \to K$ from some partial order $(K, \le)$ into itself. The latter two
properties spell out to $f(x) \le x$ and $f(f(x)) = f(x)$ for all $x \in K$.

Example 5.2 It is straightforward to see that multiplication by an idempotent 
element and hence, in particular inf, is a kernel operation. □ 

It is well-known that the image $f(K)$ of a kernel operation $f$ consists exactly
of the fixpoints of $f$.

Lemma 5.3 Let $f : K \to K$ be a kernel operation.

1. $f(x) = \bigsqcup \{y \in f(K) : y \le x\}$.

2. If $K$ has a least element 0 then $f(0) = 0$.

3. If $K$ is an upper semilattice with join operation $+$ then $f(f(x) + f(y)) =
f(x) + f(y)$, i.e., $f(K)$ is closed under $+$.

Proof. 1. By isotonicity and the above fixpoint property, $f(x)$ is an upper
bound of $S \stackrel{\mathrm{def}}{=} \{y \in f(K) : y \le x\}$. But $f(x) \in S$, since $f(x) \le x$,
and so $f(x)$ is the supremum of $S$.

2. Immediate from contractivity of $f$.

3. ($\le$) follows by contractivity of $f$.

($\ge$) By isotonicity and idempotence of $f$,

$$f(f(x) + f(y)) \ge f(f(x)) + f(f(y)) = f(x) + f(y) .$$

□

Lemma 5.4 For a kernel operation $f : K \to K$ the following two statements
are equivalent:

1. $f(K)$ is downward closed.

2. For all $a, b \in K$ such that $a \sqcap b$ exists, also $f(a) \sqcap b$ and $f(a) \sqcap f(b)$ exist and
$f(a \sqcap b) = f(a) \sqcap b = f(a) \sqcap f(b)$.

Proof. First we show that the first equation in 2. implies the second one. Assume
$f(a \sqcap b) = f(a) \sqcap b$ for all $a, b$ such that $a \sqcap b$ exists. Then by idempotence of $f$
we get, using this assumption twice,

$$f(a \sqcap b) = f(f(a \sqcap b)) = f(f(a) \sqcap b) = f(a) \sqcap f(b) .$$

(1. $\Rightarrow$ 2.) By isotonicity and contractivity of $f$ we have $f(a \sqcap b) \le f(b) \le b$ and
$f(a \sqcap b) \le f(a)$. Consider now an arbitrary lower bound $c$ for $f(a)$ and $b$. Then by
downward closure of $f(K)$ also $c \in f(K)$, i.e., $c = f(c)$. Moreover, $c \le f(a) \le a$
by contractivity of $f$. Therefore $c \le a \sqcap b$ and hence $c = f(c) \le f(a \sqcap b)$ by
isotonicity of $f$.

(2. $\Rightarrow$ 1.) Consider an $a \in f(K)$ and $b \le a$, i.e., $b = a \sqcap b$. Then by assumption
$f(b) = f(a \sqcap b) = f(a) \sqcap b = a \sqcap b = b$ and hence $b \in f(K)$ as well. □

Corollary 5.5 Suppose that $f : K \to K$ is a kernel operation and $f(K)$ is
downward closed.

1. If $a, b \in K$ with $b \le a$ then $f(b) = f(a) \sqcap b$.

2. If $K$ is bounded then $f(a) = a \sqcap f(\top)$ for all $a \in K$.

5.3 Partitions 

We now study the decomposition of elements into well-separated parts. For this,
we assume a partial order $(K, \le)$ that is an upper semilattice with join operation
$+$ and a least element 0.

Definition 5.6 Consider a pair of isotone functions $f_1, f_2 : K \to K$. Let $f$
range over $f_1, f_2$ and set $\tilde{f}_1 \stackrel{\mathrm{def}}{=} f_2$, $\tilde{f}_2 \stackrel{\mathrm{def}}{=} f_1$. Note that $\tilde{\tilde{f}} = f$. The pair is said
to weakly partition $K$ if for all $a \in K$ we have

$$f(a) + \tilde{f}(a) = a , \quad \text{(WP1)} \qquad f(\tilde{f}(a)) = 0 . \quad \text{(WP2)}$$

Of course, the concept could easily be generalized to systems consisting of
more than two functions. Let us prove a few useful consequences of this definition.
Note that by our notational convention also $\tilde{f}(f(a)) = 0$.

Lemma 5.7 Let $f$ and $\tilde{f}$ weakly partition $K$.

1. $f$ is a kernel operation.

2. $x \in f(K) \Leftrightarrow x = f(x) \Leftrightarrow \tilde{f}(x) = 0$.

3. The image set $f(K)$ is downward closed.

4. $f(K) \cap \tilde{f}(K) = \{0\}$.

5. For $y \in f(K)$ and $z \in \tilde{f}(K)$ we have $y \sqcap z = 0$. In particular, $f(x) \sqcap \tilde{f}(x) = 0$
for all $x \in K$.

Proof. 1. By assumption $f$ is isotone. Moreover, by (WP1) we have $f(x) \le x$.
Idempotence is shown, using (WP1) and (WP2), by

$$f(x) = f(f(x)) + \tilde{f}(f(x)) = f(f(x)) + 0 = f(f(x)) .$$

2. The first equivalence holds, since by 1. $f$ is a kernel operation. For the second
one we calculate, using (WP1) and (WP2),

$$x = f(x) \;\Rightarrow\; \tilde{f}(x) = \tilde{f}(f(x)) = 0 \;\Rightarrow\; x = f(x) + \tilde{f}(x) = f(x) .$$

3. Assume $z \le f(y)$ for some $y \in K$. By isotonicity of $\tilde{f}$ then $\tilde{f}(z) \le \tilde{f}(f(y)) =
0$ and hence, again by 2., also $z \in f(K)$.

4. Assume $x \in f(K) \cap \tilde{f}(K)$. By 2. then $x = f(x)$ and $f(x) = 0$ which shows
the claim.

5. For a lower bound $z$ of $x \in f(K)$ and $y \in \tilde{f}(K)$ we get by 3. and 4. that
$z \in f(K) \cap \tilde{f}(K) = \{0\}$. □

The last property means that the $f_i$ decompose every element into two parts
that have only a trivial overlap; in other words $f_1(a)$ and $f_2(a)$ have to be relative
pseudocomplements of each other.

Although weak partitions already enjoy quite a number of useful properties, 
they do not guarantee uniqueness of the decomposition. Hence we need the 
following stronger notion. 

Definition 5.8 A pair of functions $f_1, f_2 : K \to K$ is said to strongly partition
$K$ if they weakly partition $K$ and are additive, i.e., satisfy $f_i(a + b) = f_i(a) + f_i(b)$.

Lemma 5.9 Let $f_1, f_2 : K \to K$ strongly partition $K$.

1. $f(\tilde{f}(a) + b) = f(b)$, i.e., $\tilde{f}$-parts of elements are ignored by $f$.

2. $f$ is uniquely determined by $\tilde{f}$, i.e.,

$$a = x + \tilde{f}(a) \;\wedge\; x \in f(K) \;\Rightarrow\; x = f(a) .$$

Proof. 1. By additivity and (WP2),

$$f(\tilde{f}(a) + b) = f(\tilde{f}(a)) + f(b) = 0 + f(b) = f(b) .$$

2. By the assumption and 1. we get $f(a) = f(x + \tilde{f}(a)) = f(x) = x$. □

Property 1. is equivalent to additivity in this context: applying (WP1) twice,
then 1. twice and then Lemma 5.3.3, we obtain

$$f(a + b) = f(f(a) + \tilde{f}(a) + f(b) + \tilde{f}(b)) = f(f(a) + f(b)) = f(a) + f(b) .$$

5.4 Separating Finite and Infinite Elements 

Definition 5.10 An IL-semiring $K$ is called separated if, in addition to the
function $\mathrm{inf} : K \to K$ defined by $\mathrm{inf}\, x \stackrel{\mathrm{def}}{=} x \cdot 0$, there is a function $\mathrm{fin} : K \to K$
that together with inf strongly partitions $K$ and satisfies $\mathrm{fin}\, K = F$.

Example 5.11 In [10] the related notion of quemiring is studied, although no
motivation in terms of finite and infinite elements is given. A quemiring is axiomatized
as a left semiring in which each element $a$ has a unique decomposition
$a = a\P + a \cdot 0$ such that $\P$ distributes over $+$ and multiplication by an image
under $\P$ is also right-distributive. So $\P$ corresponds to our fin-operator. However,
the calculation

$$\begin{aligned} a \cdot (b + c) &= (a\P + a \cdot 0) \cdot (b + c) = a\P \cdot (b + c) + a \cdot 0 \cdot (b + c) \\ &= a\P \cdot b + a\P \cdot c + a \cdot 0 = a\P \cdot b + a\P \cdot c + a \cdot 0 \cdot b + a \cdot 0 \cdot c \\ &= (a\P + a \cdot 0) \cdot b + (a\P + a \cdot 0) \cdot c = a \cdot b + a \cdot c \end{aligned}$$

shows that a quemiring actually is a semiring and hence not too interesting from
the perspective of the present paper. □

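Example 5.12 Every Boolean IL-semiring $K$ (and hence, in particular, WOR
and STR) is separated. To see this, we first observe that for arbitrary $b \in K$ the
functions

$$f_1(x) \stackrel{\mathrm{def}}{=} x \sqcap b , \qquad f_2(x) \stackrel{\mathrm{def}}{=} x \sqcap \overline{b} ,$$

strongly partition $K$, as is easily checked. In particular, by Lemma 5.7 they are
kernel operations and hence satisfy $f_i(x) = x \sqcap f_i(\top)$ by Corollary 5.5.2.

Choosing now $b = \top \cdot 0$ we obtain $\mathrm{inf}\, x = x \sqcap \top \cdot 0$. Therefore we define
$\mathrm{fin}\, x \stackrel{\mathrm{def}}{=} x \sqcap \overline{\top \cdot 0}$. Then $\mathrm{fin}\, K = F$ follows from Lemma 5.7 and $x \in F \Leftrightarrow \mathrm{inf}\, x = 0$.

It follows that for Boolean $K$ we have

$$x \in N \Leftrightarrow x \le \top \cdot 0 , \qquad x \in F \Leftrightarrow x \le \overline{\top \cdot 0} .$$

This was used extensively in [8,9].

For Boolean $K$ we have also

$$\mathrm{inf}\, \top = \mathrm{inf}\, (1 + \overline{1}) = \mathrm{inf}\, 1 + \mathrm{inf}\, \overline{1} = \mathrm{inf}\, \overline{1} .$$

□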
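
The partition by meet with $b$ and with its complement can be checked by brute force in a small Boolean algebra. The sketch below assumes the powerset of a three-element set as carrier, with union as join and intersection as meet; the set `B` and all function names are hypothetical choices made for illustration. It only exercises the lattice-level notions (kernel operation, weak and strong partition), not composition.

```python
from itertools import combinations

U = frozenset({1, 2, 3})                   # top element of the powerset lattice
B = frozenset({1, 2})                      # the fixed element b (arbitrary choice)

def subsets(u):
    items = sorted(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

K = subsets(U)

def f1(x):
    return x & B                           # meet of x with b

def f2(x):
    return x - B                           # meet of x with the complement of b

def weakly_partitions(f, g):
    wp1 = all(f(a) | g(a) == a for a in K)                     # (WP1)
    wp2 = all(not f(g(a)) and not g(f(a)) for a in K)          # (WP2), both ways
    return wp1 and wp2

def additive(f):
    return all(f(a | b) == f(a) | f(b) for a in K for b in K)

def is_kernel(f):
    isotone = all(f(a) <= f(b) for a in K for b in K if a <= b)
    contractive = all(f(a) <= a for a in K)
    idempotent = all(f(f(a)) == f(a) for a in K)
    return isotone and contractive and idempotent
```

Checking `weakly_partitions(f1, f2)` together with `additive(f1)` and `additive(f2)` confirms the strong partition for this particular carrier; the equation $f_i(x) = x \sqcap f_i(\top)$ corresponds to `f1(x) == x & f1(U)`.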

Example 5.13 Now we give an example of an IL-semiring that is not separable.
The carrier set is $K = \{0, 1, 2\}$ with natural ordering $0 < 1 < 2$. Composition is
given by the equations

$$0 \cdot x = 0 , \qquad 1 \cdot x = x , \qquad 2 \cdot x = 2 .$$

Then $N = \{0, 2\}$ and $F = \{0, 1\}$, so that $N$ is not downward closed as it would
need to be by Lemma 5.7 if $K$ were (weakly) separable. □
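
The three-element example is small enough to verify exhaustively. The following Python sketch encodes the carrier as the integers 0, 1, 2 with `max` as addition; the function names are hypothetical. It checks the IL-semiring axioms by enumeration and computes $N$ and $F$ from their definitions.

```python
K = [0, 1, 2]

def plus(a, b):
    return max(a, b)                       # join in the chain 0 < 1 < 2

def comp(a, b):
    # 0 . x = 0,  1 . x = x,  2 . x = 2
    return b if a == 1 else a

N = {a for a in K if comp(a, 0) == a}      # non-terminating elements
F = {a for a in K if comp(a, 0) == 0}      # finite elements

def downward_closed(S):
    return all(b in S for a in S for b in K if b <= a)

def is_il_semiring():
    triples = [(a, b, c) for a in K for b in K for c in K]
    assoc = all(comp(comp(a, b), c) == comp(a, comp(b, c)) for a, b, c in triples)
    unit = all(comp(1, a) == a == comp(a, 1) for a in K)
    left_dist = all(comp(plus(a, b), c) == plus(comp(a, c), comp(b, c))
                    for a, b, c in triples)
    left_strict = all(comp(0, a) == 0 for a in K)
    right_iso = all(comp(a, b) <= comp(a, c) for a, b, c in triples if b <= c)
    return assoc and unit and left_dist and left_strict and right_iso
```

Here `downward_closed(F)` holds while `downward_closed(N)` fails, because 1 lies below 2 but is not a left zero.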

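In the presence of a left residual we can give a closed definition of fin.

Lemma 5.14 Assume an IL-semiring $K$ with a left residuation operation $/$
satisfying the Galois connection

$$y \le x / z \Leftrightarrow y \cdot z \le x .$$

If $K$ is separated then $\mathrm{fin}\, x = x \sqcap 0/0$.

Proof. By separation and Lemma 5.3.1, $\mathrm{fin}\, x = \bigsqcup \{y \in F : y \le x\}$. Therefore, by
downward closure of $F$,

$$y \le \mathrm{fin}\, x \Leftrightarrow y \in F \wedge y \le x \Leftrightarrow y \cdot 0 \le 0 \wedge y \le x \Leftrightarrow y \le 0/0 \wedge y \le x .$$

Now the claim follows by the universal characterization of meet. □

We conclude this section by listing a few properties concerning the behaviour
of inf and fin w.r.t. composition.

Lemma 5.15 Assume a separated IL-semiring $K$.

1. $a \cdot b = \mathrm{inf}\, a + \mathrm{fin}\, a \cdot b$.

2. $\mathrm{inf}\,(a \cdot b) = \mathrm{inf}\, a + \mathrm{fin}\, a \cdot \mathrm{inf}\, b$.

3. $\mathrm{fin}\,(a \cdot b) = \mathrm{fin}\,(\mathrm{fin}\, a \cdot b) \ge \mathrm{fin}\, a \cdot \mathrm{fin}\, b$. If $K$ is right-distributive, the latter
inequation can be strengthened to an equality.

Proof. 1. $a \cdot b = (\mathrm{inf}\, a + \mathrm{fin}\, a) \cdot b = \mathrm{inf}\, a \cdot b + \mathrm{fin}\, a \cdot b = \mathrm{inf}\, a + \mathrm{fin}\, a \cdot b$.

2. $\mathrm{inf}\,(a \cdot b) = a \cdot b \cdot 0 = a \cdot \mathrm{inf}\, b = (\mathrm{inf}\, a + \mathrm{fin}\, a) \cdot \mathrm{inf}\, b = \mathrm{inf}\, a \cdot \mathrm{inf}\, b + \mathrm{fin}\, a \cdot \mathrm{inf}\, b =
\mathrm{inf}\, a + \mathrm{fin}\, a \cdot \mathrm{inf}\, b$.

3. By 1. and isotonicity,

$$\begin{aligned} \mathrm{fin}\,(a \cdot b) &= \mathrm{fin}\,(\mathrm{inf}\, a + \mathrm{fin}\, a \cdot b) = \mathrm{fin}\,(\mathrm{fin}\, a \cdot b) = \mathrm{fin}\,(\mathrm{fin}\, a \cdot (\mathrm{inf}\, b + \mathrm{fin}\, b)) \\ &\ge \mathrm{fin}\,(\mathrm{fin}\, a \cdot \mathrm{inf}\, b) + \mathrm{fin}\,(\mathrm{fin}\, a \cdot \mathrm{fin}\, b) = \mathrm{fin}\, a \cdot \mathrm{fin}\, b . \end{aligned}$$

If $K$ is right-distributive, the fourth step and hence the whole calculation
can be strengthened to equalities. □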

6 Iteration Lazy Kleene Algebras 

The central operation that moves a semiring to a Kleene algebra (KA) [5] is
the star that models arbitrary but finite iteration. Fortunately, we can re-use
the conventional definition [12] for our setting of IL-semirings. In connection
with laziness, the second essential operation is the infinite iteration of an element.
This has been studied intensively in the theory of $\omega$-languages [20]. A
recent algebraic account is provided by Cohen’s $\omega$-algebras [4] and von Wright’s
demonic refinement algebra [21,22]. However, both assume right-distributivity,
Cohen even right-strictness of composition. While safety analysis of infinite computations
is also possible using star only [14], omega iteration serves to describe
liveness aspects (see e.g. [17]).

Definition 6.1 A left or lazy Kleene algebra (LKA) is a structure $(K, {}^*)$ such
that $K$ is an IL-semiring and the star ${}^*$ satisfies, for $a, b, c \in K$, the unfold and
induction laws

$$1 + a \cdot a^* \le a^* , \quad (2) \qquad b + a \cdot c \le c \Rightarrow a^* \cdot b \le c . \quad (3)$$

An LKA is strong if it also satisfies the symmetrical star induction law

$$b + c \cdot a \le c \Rightarrow b \cdot a^* \le c . \qquad (4)$$

Therefore, $a^* \cdot b$ is the least pre-fixpoint and the least fixpoint of the function
$\lambda x \,.\, a \cdot x + b$. Star is isotone with respect to the natural ordering. Even the weak
star axioms suffice to prove the following laws:

$$\begin{array}{ll} a^* \cdot a^* = a^* , & \text{(idempotence)} \\ (a + b)^* = a^* \cdot (b \cdot a^*)^* , & \text{(decomposition)} \\ a \cdot c \le c \cdot b \Rightarrow a^* \cdot c \le c \cdot b^* . & \text{(semicommutation)} \end{array}$$

In a strong LKA the star also satisfies the symmetrical star unfold axiom

$$1 + a^* \cdot a \le a^* \qquad (5)$$

and hence $b \cdot a^*$ is the least pre-fixpoint and least fixpoint of the function $\lambda x \,.\, x \cdot a + b$.
Next we note the behaviour of finite elements under the star:

Lemma 6.2 $a \in F \Rightarrow a^* \in F$.

Proof. By neutrality of 0 we get $a \cdot 0 \le 0 \Leftrightarrow a \cdot 0 + 0 \le 0$, so that star induction
(3) shows $a^* \cdot 0 \le 0$. □
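
The star laws can be tried out in a concrete finite model. The sketch below assumes binary relations on a three-element state set, which the text mentions as a prominent I-semiring (so composition is right-strict there, and the model does not exhibit laziness). Star is computed as the least fixpoint of $\lambda x \,.\, 1 + a \cdot x$ by iteration, which terminates because the carrier is finite. The relations `a` and `b` and the function names are hypothetical.

```python
STATES = range(3)
ID = frozenset((x, x) for x in STATES)          # the element 1

def comp(a, b):
    """Relational composition a . b."""
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def star(a):
    """Least fixpoint of x -> 1 + a . x, reached by Kleene iteration."""
    x = ID
    while True:
        nxt = ID | comp(a, x)
        if nxt == x:
            return x
        x = nxt

a = frozenset({(0, 1)})
b = frozenset({(1, 2), (2, 0)})

unfold_holds = (ID | comp(a, star(a))) <= star(a)
idempotence_holds = comp(star(a), star(a)) == star(a)
decomposition_holds = star(a | b) == comp(star(a), star(comp(b, star(a))))
```

For the chosen relations, `star(a | b)` is the full relation on the three states, and the right-hand side of the decomposition law evaluates to the same set.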

We now turn to infinite iteration. 

Definition 6.3 An $\omega$-LKA is a structure $(K, {}^\omega)$ consisting of an LKA $K$ and a
unary omega operation ${}^\omega$ that satisfies, for $a, b, c \in K$, the unfold and coinduction
laws

$$a^\omega = a \cdot a^\omega , \qquad (6)$$

$$c \le a \cdot c + b \Rightarrow c \le a^\omega + a^* \cdot b . \qquad (7)$$

One may wonder why we did not formulate omega unfold as $a^\omega \le a \cdot a^\omega$.
The reason is that in absence of right-strictness we cannot show the reverse
inequation. By the coinduction law, the greatest (post-)fixpoint of $\lambda x \,.\, a \cdot x$ is
$a^\omega + a^* \cdot 0$ and $a^* \cdot 0$ need not vanish in the non-strict setting. This may seem
paradoxical now. But by star induction we can easily show $a^* \cdot 0 \le a^\omega$ using
$a \cdot a^\omega \le a^\omega$, so that indeed $a^\omega$ coincides with the greatest (post-)fixpoint of
$\lambda x \,.\, a \cdot x$. The inequation $a^* \cdot 0 \le a^\omega$ seems natural, since by an easy induction
one can show $a^i \cdot 0 \le a^\omega$ for all $i \in \mathbb{N}$ anyway.

For ease of comparison we note that von Wright’s $a^\omega$ corresponds to $a^* + a^\omega$
in our setting.

Some consequences of the axioms are the following. 

Lemma 6.4 Consider an $\omega$-LKA $K$ and an element $a \in K$.

1. $K$ has a greatest element $\top \stackrel{\mathrm{def}}{=} 1^\omega$.

2. Omega is isotone with respect to the natural ordering.

3. $a^* \cdot a^\omega = a^\omega$.

4. $a^\omega$ is a right ideal, i.e., $a^\omega \cdot \top = a^\omega$.

Proof. 1. This follows from neutrality of 1 and omega coinduction (7).

2. Immediate from isotonicity of the fixed point operators.

3. The inequation $a^* \cdot a^\omega \le a^\omega$ is immediate from the star induction law (3).
The reverse inequation follows from $1 \le a^*$ and isotonicity.

4. First, by the fixpoint property of $a^\omega$ we get $a^\omega \cdot \top = a \cdot a^\omega \cdot \top$. Hence
$a^\omega \cdot \top \le a^\omega$. The reverse inequation is immediate from neutrality of 1 and
isotonicity. □

We note that in a separated $\omega$-LKA the set $F$ has the greatest element $\mathrm{fin}\, \top$;
this element is sometimes termed “havoc”, since it represents the most nondeterministic
but always terminating program.

Further laws together with applications to termination analysis can be found
in [7]. We conclude this section with some decomposition properties for star and
omega.

Lemma 6.5 Assume a separated $\omega$-LKA $K$.

1. $a^* = (\mathrm{fin}\, a)^* \cdot (1 + \mathrm{inf}\, a)$.

2. $\mathrm{inf}\, a^* = (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a$.

3. $a \cdot (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a = (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a$.

4. $a^\omega = (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a + (\mathrm{fin}\, a)^\omega$.

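Proof.

1. $a^* = (\mathrm{fin}\, a + \mathrm{inf}\, a)^* = (\mathrm{fin}\, a)^* \cdot (\mathrm{inf}\, a \cdot (\mathrm{fin}\, a)^*)^* =
(\mathrm{fin}\, a)^* \cdot (\mathrm{inf}\, a)^* = (\mathrm{fin}\, a)^* \cdot (1 + \mathrm{inf}\, a \cdot (\mathrm{inf}\, a)^*) = (\mathrm{fin}\, a)^* \cdot (1 + \mathrm{inf}\, a)$.

2. Using 1. we get

$$a^* \cdot 0 = (\mathrm{fin}\, a)^* \cdot (1 + \mathrm{inf}\, a) \cdot 0 = (\mathrm{fin}\, a)^* \cdot (1 \cdot 0 + \mathrm{inf}\, a \cdot 0) = (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a .$$

3. $a \cdot (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a = (\mathrm{fin}\, a + \mathrm{inf}\, a) \cdot (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a =
\mathrm{fin}\, a \cdot (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a + \mathrm{inf}\, a \cdot (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a = \mathrm{fin}\, a \cdot (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a + \mathrm{inf}\, a =
(\mathrm{fin}\, a \cdot (\mathrm{fin}\, a)^* + 1) \cdot \mathrm{inf}\, a = (\mathrm{fin}\, a)^* \cdot \mathrm{inf}\, a$.

4. The inequation $\ge$ holds by isotonicity of omega, by 3. and omega coinduction.
The reverse inequation reduces by omega coinduction (7) to the first formula of the chain

$$\begin{aligned} & a^\omega \le (\mathrm{fin}\, a) \cdot a^\omega + \mathrm{inf}\, a \;\Leftrightarrow\; a^\omega \le (\mathrm{fin}\, a) \cdot a^\omega + (\mathrm{inf}\, a) \cdot a^\omega \\ \Leftrightarrow\; & a^\omega \le (\mathrm{fin}\, a + \mathrm{inf}\, a) \cdot a^\omega \;\Leftrightarrow\; a^\omega \le a \cdot a^\omega \;\Leftrightarrow\; \mathrm{TRUE} , \end{aligned}$$

where the last step holds by omega unfold (6). □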

7 Tests, Domain and Codomain 

Definition 7.1 1. A left test semiring is a two-sorted structure $(K, \mathrm{test}(K))$,
where $K$ is an IL-semiring and $\mathrm{test}(K) \subseteq K$ is a Boolean algebra embedded
into $K$, such that the join and meet operations of $\mathrm{test}(K)$ coincide with the
restrictions of $+$ and $\cdot$ of $K$ to $\mathrm{test}(K)$, respectively, and such that 0 and 1
are the least and greatest elements of $\mathrm{test}(K)$. In particular, $p \le 1$ for all
$p \in \mathrm{test}(K)$. But in general, $\mathrm{test}(K)$ is only a subalgebra of the subalgebra
of all elements below 1 in $K$. The symbol $\neg$ denotes complementation in
$\mathrm{test}(K)$.

2. A lazy Kleene algebra with tests is a left test semiring $(K, B)$ such that $K$ is
a lazy KA.

This definition generalizes the one in [13]. We will consistently use the letters
$a, b, c, \ldots$ for semiring elements and $p, q, r, \ldots$ for Boolean elements. We will also
use relative complement $p - q = p \cdot \neg q$ and implication $p \to q = \neg p + q$ with
their standard laws. For all $p \in \mathrm{test}(K)$ we have that $p^* = 1$ and $p^\omega = p \cdot \top$.

If the overall IL-semiring $K$ is Boolean, one can always choose $\mathrm{test}(K) \stackrel{\mathrm{def}}{=}
\{p \mid p \le 1\}$ as the set of tests and define $\neg p \stackrel{\mathrm{def}}{=} \overline{p} \sqcap 1$, where $\overline{a}$ is the complement
of element $a$ in the overall algebra. Note that by Lemma 4.7.1 all tests are finite.

Lemma 7.2 Assume a left test semiring $K$. Then the following hold for all
$a, b, c \in K$ and all $p, q \in \mathrm{test}(K)$.

1. If $a \sqcap b$ exists then $p \cdot (a \sqcap b) = p \cdot a \sqcap b = p \cdot a \sqcap p \cdot b$.

2. $(p \sqcap q) \cdot a = p \cdot a \sqcap q \cdot a$.

3. $p \sqcap q = 0 \Rightarrow p \cdot a \sqcap q \cdot a = 0$.

4. If $b \le a$ then $p \cdot b = b \sqcap p \cdot a$.

In particular, if $K$ is bounded then $p \cdot b = b \sqcap p \cdot \top$.

Proof. We first note that for any test $p \in \mathrm{test}(K)$ the function $f_p(a) \stackrel{\mathrm{def}}{=} p \cdot a$ is a
kernel operation by $p \le 1$, isotonicity of $\cdot$ in both arguments and multiplicative
idempotence of tests. Next we want to show that $f_p(K)$ is downward closed.
Suppose $b \le p \cdot a$. Then by isotonicity, $\neg p \cdot b \le \neg p \cdot p \cdot a = 0$ and hence

$$b = 1 \cdot b = (p + \neg p) \cdot b = p \cdot b + \neg p \cdot b = p \cdot b ,$$

i.e., $b = f_p(b) \in f_p(K)$, too.

Now the claims other than 2. follow immediately from Lemma 5.4 and Corollary 5.5.
For 2. set $b = a$ and use 1. twice together with $p \sqcap q = p \cdot q$. □

Let now semiring element $a$ describe an action or abstract program and a
test $p$ a proposition or assertion on its states. Then $p \cdot a$ describes a restricted
program that acts like $a$ when the initial state satisfies $p$ and aborts otherwise.
Symmetrically, $a \cdot p$ describes a restriction of $a$ in its possible final states.

To show the interplay of tests with infinite iteration we prove a simple in- 
variance property: 

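Lemma 7.3 $p \cdot a = p \cdot a \cdot p \Rightarrow p \cdot a^\omega = (p \cdot a)^\omega$. This means that an invariant
of $a$ will hold throughout the infinite iteration of $a$.

Proof. ($\ge$) We do not even need the assumption:

$$(p \cdot a)^\omega = p \cdot a \cdot (p \cdot a)^\omega = p \cdot p \cdot a \cdot (p \cdot a)^\omega = p \cdot (p \cdot a)^\omega \le p \cdot a^\omega .$$

($\le$) By the fixpoint property of omega and the assumption,

$$p \cdot a^\omega = p \cdot a \cdot a^\omega = p \cdot a \cdot p \cdot a^\omega ,$$

which means that $p \cdot a^\omega$ is a fixpoint of $\lambda x \,.\, p \cdot a \cdot x$ and hence below its greatest
fixpoint $(p \cdot a)^\omega$. □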

We now introduce an abstract domain operator $\ulcorner$ that assigns to $a$ the test
that describes precisely its starting states.

Definition 7.4 A semiring with domain [6] (a $\ulcorner$-semiring) is a structure
$(K, \ulcorner)$, where $K$ is an idempotent semiring and the domain operation
$\ulcorner : K \to \mathrm{test}(K)$ satisfies for all $a,b \in K$ and $p \in \mathrm{test}(K)$

$$a \le \ulcorner a \cdot a , \quad (d1) \qquad \ulcorner(p \cdot a) \le p , \quad (d2) \qquad \ulcorner(a \cdot \ulcorner b) \le \ulcorner(a \cdot b) . \quad (d3)$$

If $K$ is an LKA, we speak of an LKA with domain, briefly $\ulcorner$-LKA.

These axioms can be understood as follows. (d1), which by isotonicity can
be strengthened to an equality, means that restriction to all starting states
is no actual restriction, whereas (d2) means that after restriction the remaining
starting states should satisfy the restricting test. (d3) states that the domain of
$a \cdot b$ is not determined by the inner structure or the final states of $b$; information
about $\ulcorner b$ in interaction with $a$ suffices.

To further explain (d1) and (d2) we note that their conjunction is equivalent
to each of

$$\ulcorner a \le p \Leftrightarrow a \le p \cdot a , \quad (llp) \qquad \ulcorner a \le p \Leftrightarrow \neg p \cdot a \le 0 . \quad (gla)$$

(llp) says that $\ulcorner a$ is the least left preserver of $a$. (gla) says that
$\neg\ulcorner a$ is the greatest left annihilator of $a$. By Boolean algebra (gla) is
equivalent to

$$p \cdot \ulcorner a \le 0 \Leftrightarrow p \cdot a \le 0 .$$

Because of (llp), domain is uniquely characterized by the axioms. Moreover, if
$\mathrm{test}(K)$ is complete then domain always exists. If $\mathrm{test}(K)$ is not complete,
this need not be the case.
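As a sanity check, the axioms (d1), (d2) and the characterization (llp) can be tested by
brute force in the relational model. This is a sketch under the assumption that relations
are finite sets of pairs and tests are subidentities; it checks one hypothetical relation
against all tests on four states and proves nothing in general. The helper names are
illustrative.

```python
from itertools import combinations

STATES = range(4)

def compose(a, b):
    """Relational composition: run a, then b."""
    return {(x, z) for (x, y1) in a for (y2, z) in b if y1 == y2}

def as_test(states):
    """Subidentity relation representing a set of states."""
    return {(s, s) for s in states}

def domain(a):
    """Test describing exactly the starting states of a."""
    return as_test({x for (x, _) in a})

# Hypothetical relation on states 0..3.
a = {(0, 1), (0, 2), (2, 3)}
d = domain(a)
all_tests = [as_test(c) for r in range(len(STATES) + 1)
             for c in combinations(STATES, r)]

# (d1): a <= domain(a) . a
d1_holds = a <= compose(d, a)
# (d2): domain(p . a) <= p for every test p
d2_holds = all(domain(compose(p, a)) <= p for p in all_tests)
# (llp): domain(a) <= p  iff  a <= p . a
llp_holds = all((d <= p) == (a <= compose(p, a)) for p in all_tests)
print(sorted(d), d1_holds, d2_holds, llp_holds)
```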

Although the axioms are the same as in [6], one has to check whether their
consequences in KA can still be proved in LKA. Fortunately, this is the case.
Right-distributivity was used in [6] only for the proofs of additivity and the
import/export law $\ulcorner(p \cdot a) = p \cdot \ulcorner a$. But the latter follows from (d3)
and stability $\ulcorner p = p$ (which, in turn, follows from (llp) and idempotence of tests).
Additivity is a special case of

Lemma 7.5 Domain is universally disjunctive. In particular, $\ulcorner 0 = 0$.

The proof has been given in [18]; it only uses (llp) and isotonicity of domain.
But the latter follows easily from (gla).

From (d1) and left strictness of composition we also get

$$\ulcorner a = 0 \Rightarrow a = 0 . \qquad (8)$$

Two other useful properties are

Lemma 7.6 1. $\ulcorner(a \cdot b) \le \ulcorner a$.

2. If $K$ is bounded then $\ulcorner(a \cdot \top) = \ulcorner a$.

Proof. 1. Using (llp) we get

$$\ulcorner a \le p \Leftrightarrow a \le p \cdot a \Rightarrow a \cdot b \le p \cdot a \cdot b \Leftrightarrow \ulcorner(a \cdot b) \le p ,$$

and the claim follows by indirect inequality, i.e., by

$$x \le y \Leftrightarrow \forall z .\ y \le z \Rightarrow x \le z .$$

2. The inequation $\le$ follows from 1., whereas $\ge$ follows from $1 \le \top$ and
isotonicity. □

Finally, the induction law $\ulcorner(a \cdot p) \le p \Rightarrow \ulcorner(a^* \cdot p) \le p$ can be
proved as in [6] (the LKA does not even need to be strong).

We now turn to the dual case of the codomain operation. In the KA case
where we have also right-distributivity, a codomain operation $\urcorner$ can easily be
defined as a domain operation in the opposite semiring where, as usual in algebra,
opposition just swaps the order of composition. But by lack of right-distributivity
this does not work in the LKA setting; we additionally have to postulate
isotonicity of codomain (in the form of superdisjunctivity to have a purely
equational axiom).

Definition 7.7 A left semiring with codomain (a $\urcorner$-semiring) is a structure
$(K, \urcorner)$, where $K$ is a left test semiring and the codomain operation
$\urcorner : K \to \mathrm{test}(K)$ satisfies, for all $a,b \in K$ and $p \in \mathrm{test}(K)$,

$$a \le a \cdot a\urcorner , \qquad (cd1)$$
$$(a \cdot p)\urcorner \le p , \qquad (cd2)$$
$$(a\urcorner \cdot b)\urcorner \le (a \cdot b)\urcorner , \qquad (cd3)$$
$$(a + b)\urcorner \ge a\urcorner + b\urcorner . \qquad (cd4)$$

If $K$ is an LKA, we speak of an LKA with codomain, briefly $\urcorner$-LKA.

As for domain, the conjunction of (cd1) and (cd2) is equivalent to

$$a\urcorner \le p \Leftrightarrow a \le a \cdot p , \qquad (lrp)$$

i.e., $a\urcorner$ is the least right preserver of $a$. However, by lack of right-strictness,
$\neg a\urcorner$ is not the greatest right annihilator of $a$; (lrp) only implies

$$a\urcorner \le p \Rightarrow a \cdot \neg p \le a \cdot 0 . \qquad (wgra)$$

The reverse implication (wgra) $\Rightarrow$ (lrp) holds in presence of weak
right-distributivity

$$a = a \cdot p + a \cdot \neg p \qquad (wrd)$$

and provided $a$ is finite. Note that (wrd) holds automatically for all $a \in N$.
Moreover, (wrd) is equivalent to full right-distributivity over sums of tests: assuming
(wrd), we calculate

$$\begin{array}{l}
a \cdot (p + q) = a \cdot (p + q) \cdot p + a \cdot (p + q) \cdot \neg p = \\
a \cdot (p \cdot p + q \cdot p) + a \cdot (p \cdot \neg p + q \cdot \neg p) = \\
a \cdot p + a \cdot q \cdot \neg p \le a \cdot p + a \cdot q .
\end{array}$$

The reverse inequation follows from monotonicity and superdisjunctivity. We
will not assume (wrd) in the sequel, though.

In an LKA, the symmetry between domain and codomain is broken also in
other respects. The analogue of (8) does not hold; rather we have

Lemma 7.8 $a\urcorner = 0 \Leftrightarrow a \in N$.

Proof. Recall that $a \in N \Leftrightarrow a = a \cdot 0$. Now, by (cd1), $a\urcorner = 0$ implies
$a = a \cdot 0$, whereas the reverse implication is shown by (cd2). □

However, since for domain the proof of preservation of suprema only involves
isotonicity and (llp), we can carry it over to codomain and obtain

Lemma 7.9 Codomain is universally disjunctive and hence, in particular,
additive and strict.

Also, the proof of stability of domain uses only (llp) and hence is also valid
for the codomain case, so that $p\urcorner = p$ for all $p \in \mathrm{test}(K)$. The import/export law
$(a \cdot p)\urcorner = a\urcorner \cdot p$ follows from (cd3) and stability. Finally,

Lemma 7.10 In a domain/codomain LKA, $a\urcorner \cdot \ulcorner b = 0 \Rightarrow a \cdot b = a \cdot 0$.

Further properties of domain and codomain can be found in [6].



8 Modal LKAs 



Definition 8.1 A modal left semiring is a left test semiring $K$ with domain and
codomain. If $K$ in addition is an LKA, we call it a modal LKA.

Let $K$ be a modal left semiring. We introduce forward and backward diamond
operators via abstract preimage and image,

$$|a\rangle p = \ulcorner(a \cdot p) , \qquad (9)$$
$$\langle a| p = (p \cdot a)\urcorner , \qquad (10)$$

for all $a \in K$ and $p \in \mathrm{test}(K)$. The box operators are, as usual, the de Morgan
duals of the diamonds:

$$|a] p = \neg |a\rangle \neg p , \qquad (11)$$
$$[a| p = \neg \langle a| \neg p . \qquad (12)$$

If $a \in N$ then these definitions specialize to

$$|a\rangle p = \ulcorner a , \qquad (13)$$
$$\langle a| p = 0 , \qquad (14)$$
$$|a] p = \neg\ulcorner a , \qquad (15)$$
$$[a| p = 1 , \qquad (16)$$

since then also $p \cdot a \in N$ by Lemma 4.3.1.

In the KA case, diamonds and boxes satisfy an exchange law. Let us work
out the meaning of the two formulas involved in that law. Using the definitions,
Boolean algebra and (gla)/(wgra), we obtain

$$p \le |a] q \Leftrightarrow p \le \neg\ulcorner(a \cdot \neg q) \Leftrightarrow \ulcorner(a \cdot \neg q) \le \neg p \Leftrightarrow p \cdot a \cdot \neg q \le 0$$

and

$$\langle a| p \le q \Leftrightarrow (p \cdot a)\urcorner \le q \Rightarrow p \cdot a \cdot \neg q \le a \cdot 0 .$$

So for finite $a$ we regain the Galois connection

$$p \le |a] q \Leftrightarrow \langle a| p \le q ,$$

which, however, does not hold for $a \in N$. By an analogous argument one can
show that also

$$p \le [a| q \Leftrightarrow |a\rangle p \le q$$

holds when $a \in F$.

The Galois connections have interesting consequences. In particular
diamonds (boxes) of finite elements commute with all existing suprema (infima)
of the test algebra.
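In the relational model every element is finite, so the Galois connection between the
forward box and the backward diamond can be checked exhaustively on a small example.
The sketch below assumes relations are sets of pairs and represents tests directly as
sets of states; the function names `fdia`, `bdia`, `fbox` and the sample relation are
illustrative assumptions.

```python
from itertools import combinations

STATES = range(4)

def fdia(a, p):
    """Forward diamond |a>p: states from which a can reach p (preimage)."""
    return {x for (x, y) in a if y in p}

def bdia(a, p):
    """Backward diamond <a|p: states reachable via a from p (image)."""
    return {y for (x, y) in a if x in p}

def fbox(a, p):
    """Forward box |a]p, the de Morgan dual of the forward diamond."""
    everything = set(STATES)
    return everything - fdia(a, everything - p)

# Hypothetical relation on states 0..3.
a = {(0, 1), (0, 2), (1, 3)}
subsets = [set(c) for r in range(len(STATES) + 1)
           for c in combinations(STATES, r)]

# p <= |a]q  iff  <a|p <= q, checked for all pairs of tests.
galois = all((p <= fbox(a, q)) == (bdia(a, p) <= q)
             for p in subsets for q in subsets)
print(fbox(a, {1, 2}), galois)
```

States without any successor satisfy every box formula vacuously, which is why
`fbox(a, {1, 2})` contains states 2 and 3 as well as state 0.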

In the sequel, when the direction of diamonds and boxes does not matter,
we will use the notation $\langle a \rangle$ and $[a]$. For a test $p$ the modal operators satisfy
$\langle p \rangle q = p \cdot q$ and $[p] q = p \to q$. Hence, $\langle 1 \rangle = [1]$ is the identity
function on tests. Moreover, $\langle 0 \rangle p = 0$ and $[0] p = 1$. Finally, in an LKA with
converse $\breve{\ }$ we have $|a\breve{\ }\rangle = \langle a|$ and $|a\breve{\ }] = [a|$.

By left-distributivity, the forward modalities distribute over $+$ in the
following way:

$$|a + b\rangle p = |a\rangle p + |b\rangle p , \qquad |a + b] p = (|a] p) \cdot (|b] p) .$$

Hence, in a separated test semiring we obtain

$$|a\rangle p = |\mathrm{fin}\, a\rangle p + \ulcorner(\mathrm{inf}\, a) , \qquad |a] p = |\mathrm{fin}\, a] p \cdot \neg\ulcorner(\mathrm{inf}\, a) .$$

Using the forward box we can give another characterization of finite elements:

Lemma 8.2 $a \in F \Leftrightarrow |a] 1 = 1$.

Proof. By the definitions, $|a] 1 = \neg\ulcorner(a \cdot 0)$. Now

$$a \in F \Leftrightarrow a \cdot 0 = 0 \Leftrightarrow \ulcorner(a \cdot 0) = 0 \Leftrightarrow \neg\ulcorner(a \cdot 0) = 1 \Leftrightarrow |a] 1 = 1 . \quad \Box$$

Further applications of modal operators, notably for expressing Noethericity 
and performing termination analysis, can be found in [7]. 

9 Predicate Transformer Algebras 

Assume a left test semiring $(K, +, \cdot, 0, 1)$. By a predicate transformer we mean a
function $f : \mathrm{test}(K) \to \mathrm{test}(K)$. It is disjunctive if $f(p + q) = f(p) + f(q)$ and
conjunctive if $f(p \cdot q) = f(p) \cdot f(q)$. It is strict if $f(0) = 0$. Finally, id is the
identity transformer and $\circ$ denotes function composition.

Let $P$ be the set of all predicate transformers, $M$ the set of isotone and $D$
the set of strict and disjunctive ones. Under the pointwise ordering
$f \le g \stackrel{\mathrm{def}}{\Leftrightarrow} \forall p .\ f(p) \le g(p)$, $P$ forms a lattice where the
supremum $f + g$ and infimum $f \sqcap g$ of $f$ and $g$ are the pointwise liftings of $+$ and
$\cdot$, resp.:

$$(f + g)(p) \stackrel{\mathrm{def}}{=} f(p) + g(p) , \qquad (f \sqcap g)(p) \stackrel{\mathrm{def}}{=} f(p) \cdot g(p) .$$

The least element of $P$ (and $M$ and $D$) is the constant 0-valued function $\mathbf{0}$.
The substructure $(M, +, \mathbf{0}, \circ, \mathrm{id})$ is an IL-semiring. In fact, $\circ$ is even
universally left-disjunctive and preserves all existing infima, as the following
calculation and a dual one for infima show:

$$((\bigsqcup F) \circ g)(x) = (\bigsqcup F)(g(x)) = \bigsqcup F(g(x)) = (\bigsqcup (F \circ g))(x) .$$

The modal operator $|\_\rangle$ provides a left semiring homomorphism from $K$ into $M$.
The substructure $(D, +, \mathbf{0}, \circ, \mathrm{id})$ is even an idempotent semiring.

If $\mathrm{test}(K)$ is a complete Boolean algebra then $P$ is a complete lattice with
$M$ and $D$ as complete sublattices. Hence we can extend $M$ and $D$ by a star
operation via a least fixpoint definition:

$$f^* \stackrel{\mathrm{def}}{=} \mu g .\ \mathrm{id} + f \circ g ,$$

where $\mu$ is the least-fixpoint operator.
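For a finite test algebra the least fixpoint can be computed by plain Kleene iteration.
The following sketch assumes tests are subsets of a finite state set, so that the
iteration terminates, and that `f` is isotone; the functional $g \mapsto \mathrm{id} + f \circ g$
acts pointwise, so it suffices to iterate $x \mapsto p \cup f(x)$ from the empty set for each
argument $p$. The transformer `pre` and the relation `step` are hypothetical examples.

```python
def star(f):
    """Star of an isotone predicate transformer f on subsets of a finite set.

    Computes (mu g . id + f o g)(p) pointwise by iterating x -> p | f(x)
    from the empty set until the value stabilizes.
    """
    def f_star(p):
        x = frozenset()
        while True:
            nxt = frozenset(p) | f(x)
            if nxt == x:
                return x
            x = nxt
    return f_star

# Hypothetical transition relation on states 0..3.
step = {(0, 1), (1, 2), (2, 3)}

def pre(p):
    """Preimage of p under step (a forward diamond in the relational model)."""
    return frozenset(x for (x, y) in step if y in p)

can_reach = star(pre)
print(sorted(can_reach({3})), sorted(can_reach({1})))
```

With these definitions `can_reach(p)` is the set of states from which some state of `p`
is reachable in zero or more steps.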

Using $\mu$-subfusion (see below) one sees that by this definition $M$ becomes an
LKA which, however, is not strong. Only the subalgebra of universally disjunctive
predicate transformers is strong.

Similarly, if $\mathrm{test}(K)$ is complete we can define the infinite iteration
$f^\omega \stackrel{\mathrm{def}}{=} \nu g .\ f \circ g$, where $\nu$ is the greatest-fixpoint operator. Whereas
in $M$ this does not imply the omega coinduction law, it does so in $D$.

Combining these two observations, we conclude that only the subalgebra
of universally disjunctive predicate transformers can be made into an $\omega$-LKA
(which is even strong).

By passing to the mirror ordering, we see that also the subalgebra of universally
conjunctive predicate transformers can be made into a strong $\omega$-LKA; this
is essentially the approach taken in [21,22].

As a sample proof we show that the omega coinduction law holds for disjunctive
predicate transformers. First we briefly repeat the fixpoint fusion laws
(see e.g. [2] for further fixpoint properties). Let $F,G,H : L \to L$ be isotone
functions on a complete lattice $(L, \le)$ with least element $\bot$ and greatest element
$\top$. Suppose that $G$ is continuous, i.e., preserves suprema of nonempty chains,
and assume $G(\bot) \le \mu H$. Then

$$G \circ H \le F \circ G \Rightarrow G(\mu H) \le \mu F . \qquad (\mu\text{-subfusion})$$

Suppose now dually that $G$ is cocontinuous, i.e., preserves infima of nonempty
chains, and assume $G(\top) \ge \nu H$. Then

$$G \circ H \ge F \circ G \Rightarrow G(\nu H) \ge \nu F . \qquad (\nu\text{-superfusion})$$

For the proof of omega coinduction we define $F(x) \stackrel{\mathrm{def}}{=} f \circ x + g$ and
$G(x) \stackrel{\mathrm{def}}{=} x + f^* \circ g = x + \mu F$ and $H(x) \stackrel{\mathrm{def}}{=} f \circ x$, where $x$ ranges
over $D$. Since we have assumed $\mathrm{test}(K)$ to be complete, $+$ is universally disjunctive in
both arguments, so that $G$ is continuous. The coinduction law is implied by
$\nu F \le G(\nu H)$, which by $\nu$-superfusion reduces to $G \circ H \ge F \circ G$. This is shown by

$$\begin{array}{l}
G(H(x)) = f \circ x + \mu F = f \circ x + F(\mu F) = f \circ x + f \circ \mu F + g = \\
f \circ (x + \mu F) + g = f \circ G(x) + g = F(G(x)) .
\end{array}$$

Note that this calculation uses finite, but not universal, disjunctivity of $f$ in an
essential way. For the subclass of universally disjunctive predicate transformers
over a power set lattice the result is well-known, since they are isomorphic to
relations [1].

It should also be mentioned that the treatment, of course, generalizes to
functions $f : L \to L$ over an arbitrary complete lattice $L$.

10 Conclusion and Outlook

We have seen that it is possible to integrate non-strictness with finite and infinite 
iteration as well as with modal operators. This framework allows, for instance, 
an abstract and more concise reworking of the stream applications treated in 
[17]; this will be the subject of further papers. But hopefully the framework will
have many more applications.

Abstract. We show how to introduce demonic and angelic nondeterminacy
into the term language of each type in a typical programming or
specification language. For each type we introduce (binary infix) operators
$\sqcap$ and $\sqcup$ on terms of the type, corresponding to demonic and angelic
nondeterminacy, respectively. We generalise these operators to accommodate
unbounded nondeterminacy. We axiomatise the operators and
derive their important properties. We show that a suitable model for
nondeterminacy is the free completely distributive complete lattice over
a poset, and we use this to show that our axiomatisation is sound. In
the process, we exhibit a strong relationship between nondeterminacy
and free lattices that has not hitherto been evident. Although nondeterminacy
arises naturally in specification and programming languages, we
speculate that it combines fruitfully with function theory to the extent
that it can play an important role in facilitating proofs of programs that
have no apparent connection with nondeterminacy.

Keywords: angelic nondeterminacy, demonic nondeterminacy, free completely
distributive lattice.



1 Introduction 

Specification languages typically include operators $\sqcap$ and $\sqcup$ on specifications.
For specifications $s_0$ and $s_1$, $s_0 \sqcap s_1$ denotes the disjunction of $s_0$ and $s_1$, i.e.
the specification which requires that either $s_0$ or $s_1$ (or both) be satisfied, and
$s_0 \sqcup s_1$ denotes their conjunction, i.e. the specification which requires that both
$s_0$ and $s_1$ be satisfied. They occur, for example, in the specification language
Z [13] as so-called schema disjunction and conjunction, respectively. They have
analogues in programming languages where they are operationally interpreted
as demonic and angelic nondeterministic choice, respectively. In the language
of guarded commands [5], for example, the statement $s_0 \sqcap s_1$ is executed by
executing precisely one of the statements $s_0$ and $s_1$. The choice is made afresh
each time $s_0 \sqcap s_1$ is executed, and is made by a demon who chooses vagariously.
Consequently if we have a particular goal in mind for $s_0 \sqcap s_1$ then each of $s_0$
and $s_1$ must achieve it individually. $s_0 \sqcup s_1$ is also executed by executing one of
$s_0$ or $s_1$, but in this case the choice is made by an angel. The angel knows what
we want to achieve and selects whichever of $s_0$ and $s_1$ will do so. In this paper,
we show how $\sqcap$ and $\sqcup$ may be introduced into specification and programming
languages at the term level, rather than at the statement level. In other words,
we will show how $\sqcap$ and $\sqcup$ may be introduced into each type in a specification
or programming language such that for $t$ and $u$ any terms of type $T$, say, $t \sqcap u$
and $t \sqcup u$ are also terms of type $T$ which behave as we have outlined informally.

There are two reasons why we should do this: to support formal reasoning 
about nondeterminacy, and - more surprisingly - to support reasoning about 
programs and specifications in general, regardless of whether they employ non- 
determinacy or not. We already know the role of demonic choice at the term 
level in specifying and programming, and we know how to formalise it; see [8, 
12, 10, 11], for example. Angelic choice on terms has received less attention and 
even then has usually been given a second class status. In [11], for example, 
functions distribute over demonic choice but little is said with respect to angelic 
choice. Our first contribution is to place both kinds of nondeterminacy on an 
equal footing in a single satisfying theory. 

Nondeterminacy can be seen as a way of populating a type with extra “imagi- 
nary” values, and as a consequence some operators classically regarded as partial 
can become total (or at least less partial). To take a trivial example, a function
$f$ has a right inverse $f^{-1}$ satisfying $f \circ f^{-1} = \mathrm{Id}$ only when $f$ is bijective
($\mathrm{Id}$ denotes the identity function). If we admit nondeterminacy, however, only
surjectivity is required. The fact that the inverse may now rely on nondeterminacy
need not intrude on our reasoning because we typically only exploit the defin- 
ing property. As another example, we know that every monotonic function on a 
complete lattice is rich in fixpoints, and in particular has a least and a greatest 
fixpoint. If we admit nondeterminacy, we no longer require a complete lattice: 
every monotonic function is rich in fixpoints. Our theory of nondeterminacy 
combines seamlessly with function theory, to the extent that we believe that 
nondeterministic functions can play a significant role in formalising program- 
ming techniques that have no obvious connection with nondeterminacy, such as 
data refinement and abstract interpretation. We will explore these issues in a 
later paper, but first we need a general theory of dually nondeterministic types, 
and that is what we set out to achieve here. 

$\sqcap$ and $\sqcup$ can only give rise to bounded choice. We shall extend them in
the body of the paper to accommodate unbounded choice. Nondeterminacy is
always accompanied by a partial ordering relation $\sqsubseteq$, called the refinement
relation. Informally, terms $t$ and $u$ satisfy $t \sqsubseteq u$ if $t$ is at least as demonically
nondeterministic as $u$, and $u$ is at least as angelically nondeterministic as $t$. In
that case we say that $u$ is a refinement of $t$, or that $t$ is refined by $u$. We shall
also axiomatise $\sqsubseteq$.

A formal treatment of types with demonic choice has already been done, for 
example in [11], and angelic choice can be formalised as its dual. One would 
expect therefore that coalescing the two kinds of nondeterminacy would not be 
difficult. Surprisingly, this turns out not to be at all the case, either with respect 
to the proof theory or the model theory. In particular, dual nondeterminacy
requires a much more sophisticated model, which turns out to be a neglected
construct of lattice theory called the free completely distributive lattice over 
a poset. In constructing this free lattice, we exhibit a more intimate relation- 
ship between nondeterminacy and lattice theory than has hitherto been evident. 
Indeed we conjecture that free completely distributive lattices are the natural 
model for nondeterminacy. 

1.1 Outline of Rest of Paper 

In the next section we describe the syntax of terms when angelic and demonic 
choice is introduced, and describe their semantics informally. We present some 
small examples to help intuition. After that we present the axioms that govern 
nondeterminacy and derive the important algebraic properties of demonic and 
angelic choice. In the penultimate section we address the model theory, the 
greater part of which is constructing the free completely distributive lattice over 
a poset. We conclude with some commentary on the work. 

Our primary contribution is the first axiomatisation of dually nondetermin- 
istic types, complete with a proof of soundness. A secondary contribution is 
revealing a more intimate relationship between nondeterminacy and lattice the- 
ory, and in particular that a certain class of free lattices are the natural model 
for nondeterminacy. 

2 Syntax and Informal Semantics 

2.1 Terms and Finite Choice 

We assume a typed specification/programming language. We use $T, U, \ldots$ to
stand for types, and $t, u, v, \ldots$ to stand for terms. We write $t,u : T$ to assert
that terms $t$ and $u$ are of type $T$, and similarly for other than two terms. For
terms $t$ and $u$ of type $T$, say, $t \sqcap u$ and $t \sqcup u$ are also terms of type $T$ with the
informal meanings described in the introduction. It should be intuitively clear
that both operators are associative, commutative, and idempotent; we will use
associativity to omit some bracketing from the start. We intend that $\sqcap$ and $\sqcup$
should distribute over one another in all possible ways.

A consequence of our semantics is that operators distribute over $\sqcap$ and $\sqcup$. For
example, in $(2 \sqcap 3) - (2 \sqcap 3)$ the first occurrence of $(2 \sqcap 3)$ may have outcome 2,
while the second may have outcome 3, or vice versa, and hence
$(2 \sqcap 3) - (2 \sqcap 3) = -1 \sqcap 0 \sqcap 1$, which is consistent with subtraction distributing
over $\sqcap$.
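The subtraction example can be replayed with a deliberately simplified reading in which a
term using only one kind of choice is modelled by the set of its possible outcomes. This
is an illustrative assumption, not the model developed later in the paper, and it cannot
represent terms that mix demonic and angelic choice. The name `lift` is hypothetical.

```python
from itertools import product

def lift(op):
    """Lift a binary operator to outcome sets: each operand is resolved
    independently, so every combination of outcomes is possible."""
    def lifted(xs, ys):
        return {op(x, y) for x, y in product(xs, ys)}
    return lifted

minus = lift(lambda x, y: x - y)

two_or_three = {2, 3}                      # outcomes of 2 |~| 3
difference = minus(two_or_three, two_or_three)
print(sorted(difference))
```

Because the two occurrences are resolved independently, the result is not just `{0}`.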

Operators have the following relative precedence, higher precedence first (the
list anticipates some operators we have yet to introduce):

function application
$\bigsqcap \quad \bigsqcup \quad \bigwedge \quad \bigvee$
$\sqcap \quad \sqcup$
$= \quad \ne \quad \le \quad \sqsubseteq$
$\wedge$
$\vee$

For example, the brackets in the following are superfluous:

$$((t \sqcap u) \sqsubseteq \bigsqcup X) \Leftrightarrow ((t \sqsubseteq \bigsqcup X) \vee (u \sqsubseteq \bigsqcup X))$$

2.2 Ordering on Terms 

We assume each type $T$ in the language comes equipped with a partial ordering
$\le_T$ (we omit the type subscript when it can be inferred from context or is not
significant). In the case of base types such as the integers or booleans, the partial
ordering will nearly always be the discrete ordering, i.e. $x \le y$ iff $x = y$. If the
reader has in mind a type with no obvious partial ordering, then it can be trivially
ordered by the discrete ordering. For every type $T$ we introduce a refinement
relation $\sqsubseteq_T$ on terms of type $T$ (again, we usually omit the type subscript). We
intend that $\sqsubseteq_T$ agrees with $\le_T$ on terms that employ no nondeterminacy, and
otherwise $t \sqsubseteq u$ holds iff $t$ is at least as demonically nondeterministic as $u$, and
$u$ is at least as angelically nondeterministic as $t$, where $t, u : T$. For example,
all of the following hold: $2 \sqsubseteq 2$, $2 \sqcap 3 \sqcap 4 \sqsubseteq 2 \sqcap 4$, $2 \sqcup 4 \sqsubseteq 2 \sqcup 3 \sqcup 4$, and
$(2 \sqcap 3) \sqcup 4 \sqsubseteq 2 \sqcup 4 \sqcup 5$. We might expect from lattice theory that refinement and
choice are related via $t \sqsubseteq u \Leftrightarrow t \sqcap u = t \Leftrightarrow t \sqcup u = u$, and that will turn out to
be the case.

2.3 Unbounded Choice 

To accommodate unbounded demonic choice we introduce the term $(\sqcap x{:}T \mid P \bullet t)$.
Here variable $x$ is bound by $\sqcap$ and may occur free in predicate $P$ and term $t$
(playing the role of a term of type $T$). $(\sqcap x{:}T \mid P \bullet t)$ is equivalent to the demonic
choice over terms $t[x \backslash i]$ for each $i \in T$ satisfying $P[x \backslash i]$. Above and elsewhere,
$t[x \backslash u]$ denotes term $t$ in which term $u$ is substituted for $x$, with the usual caveats
about avoiding variable capture, and similarly for substitution in predicates.
$(\sqcap x{:}T \mid P \bullet t)$ has the same type as $t$. For example,
$(\sqcap x{:}\mathbb{Z} \mid 0 \le x < 3 \bullet x \sqcup 5)$ has type $\mathbb{Z}$ and is equivalent to
$(0 \sqcup 5) \sqcap (1 \sqcup 5) \sqcap (2 \sqcup 5)$. Simple abbreviations include
$(\sqcap x{:}T \mid P)$ for $(\sqcap x{:}T \mid P \bullet x)$, and $(\sqcap x{:}T \bullet t)$ for
$(\sqcap x{:}T \mid \mathit{true} \bullet t)$ where $\mathit{true}$ stands for some theorem.
$(\sqcup x{:}T \mid P \bullet t)$ and its obvious abbreviations are defined analogously.
Observe that where precisely two values of type $T$ satisfy $P$ - call them $j$ and
$k$ - then $(\sqcap x{:}T \mid P)$ is equivalent to $j \sqcap k$, and analogously for $(\sqcup x{:}T \mid P)$. When
just a single value $k$ of type $T$ satisfies $P$, then both $(\sqcap x{:}T \mid P)$ and $(\sqcup x{:}T \mid P)$ are
equivalent to $k$.

Let $\mathit{false}$ stand for $\neg \mathit{true}$. $(\sqcup x{:}T \mid \mathit{false})$ is given the special name $\bot_T$
(pronounced bottom), and $(\sqcap x{:}T \mid \mathit{false})$ is given the name $\top_T$ (pronounced top).
$(\sqcup x{:}T \mid \mathit{true})$ is given the special name $\mathit{some}_T$, and $(\sqcap x{:}T \mid \mathit{true})$ is given the
name $\mathit{all}_T$. Again, we commonly omit the type subscripts. As suggested by their names,
$\bot$ and $\top$ satisfy $\bot \sqsubseteq t \sqsubseteq \top$ for every term $t$. It follows that $\bot$ is the unit of
$\sqcup$ and the zero of $\sqcap$, and $\top$ is the unit of $\sqcap$ and the zero of $\sqcup$.

2.4 Examples 

We exercise the notation on a few small examples. To begin, the function
$\lambda z{:}\mathbb{Z} \bullet (\sqcap x{:}\mathbb{Z} \times \mathbb{Z} \mid \mathit{add}\ x = z)$ is a right inverse of $\mathit{add}$, the
addition function on $\mathbb{Z} \times \mathbb{Z}$. So also is
$\lambda z{:}\mathbb{Z} \bullet (\sqcup x{:}\mathbb{Z} \times \mathbb{Z} \mid \mathit{add}\ x = z)$.

Suppose the type $\mathit{PhoneBook}$ is comprised of all relations from type $\mathit{Name}$
to type $\mathit{PhoneNumber}$. Then the following function looks up a person's phone
number in a phone book:

$$\mathit{lookUp} = \lambda n{:}\mathit{Name} \bullet \lambda b{:}\mathit{PhoneBook} \bullet (\sqcap x{:}\mathit{PhoneNumber} \mid (n, x) \in b).$$

For a more interesting example, consider the function

$$\mathit{leastWRT}_T : (T \to \mathbb{Z}) \to \mathbb{P}T \to T$$

which takes as arguments a function $f$ and a set $s$, and selects some element $x$
of $s$ such that $f\,x$ is minimized. We omit the type subscript in what follows.
Formally, for any type $T$:

$$\mathit{leastWRT} = \lambda f{:}T \to \mathbb{Z} \bullet \lambda s{:}\mathbb{P}T \bullet (\sqcup x{:}T \mid x \in s \wedge (\forall y{:}s \bullet f\,x \le f\,y)).$$

To illustrate its use, we make a function which yields a place to which two
people should travel if they wish to meet as soon as possible. We assume a type
$\mathit{Place}$ whose elements represent all places of interest, and a function
$\mathit{time} : \mathit{Place} \times \mathit{Place} \to \mathbb{N}$ which yields the travelling time (in minutes, say)
between any two places. The function we want is:

$$\lambda \mathit{me}, \mathit{you}{:}\mathit{Place} \bullet \mathit{leastWRT}\ (\lambda c{:}\mathit{Place} \bullet \mathit{time}(\mathit{me},c)\ \mathrm{max}\ \mathit{time}(\mathit{you},c))\ \mathit{Place}.$$

(We have taken the liberty of overloading $\mathit{Place}$ so that it also denotes the set
of values in type $\mathit{Place}$.)
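One way to experiment with this definition is to let the angelic choice be represented by
the set of all candidates the angel may pick from. The sketch below makes that assumption;
the place names, their positions on a line, and the function names `least_wrt`, `time`
and `meeting_places` are hypothetical and serve only to make the example runnable.

```python
def least_wrt(f, s):
    """Return the set of elements of s that minimize f.

    The set stands for an angelic choice among its members; the empty
    set stands for the choice over no candidates at all.
    """
    if not s:
        return set()
    best = min(f(x) for x in s)
    return {x for x in s if f(x) == best}

# Hypothetical places laid out on a line; travel time is the distance.
POSITION = {"Airport": 0, "Station": 4, "Cafe": 6, "Park": 10}
PLACES = set(POSITION)

def time(a, b):
    """Assumed travelling time in minutes between two places."""
    return abs(POSITION[a] - POSITION[b])

def meeting_places(me, you):
    """Places minimizing the later of the two arrival times."""
    return least_wrt(lambda c: max(time(me, c), time(you, c)), PLACES)

print(sorted(meeting_places("Airport", "Park")))
```

In this example two places are equally good, so the result is a genuine choice rather than
a single value.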

Many games can be expressed using a combination of angelic and demonic
nondeterminacy [1]. Consider the classical game Nim, for example, in which
two players alternately remove from 1 to 3 matches, say, from a pile until none
remain. The player who removes the last match loses. Let us refer to the players
as the home and away player, respectively, where the home player goes first.
A single move made by each player is represented by the functions $\mathit{moveH}$ and
$\mathit{moveA}$, respectively:

$$\mathit{moveH}, \mathit{moveA} : \mathbb{N} \to \mathbb{N}$$

$\mathit{moveH}\ n$ ($\mathit{moveA}\ n$) yields the number of matches remaining after the home
(away) player has made one move when offered $n$ matches. We introduce the
two-value type $\mathit{Player}$ with elements $\mathit{Home}$ and $\mathit{Away}$. The complete game played
by each player is represented by functions $\mathit{playH}$ and $\mathit{playA}$, respectively:

$$\mathit{playH}, \mathit{playA} : \mathbb{N} \to \mathit{Player}$$

$\mathit{playH}\ n$ ($\mathit{playA}\ n$) yields the winner of a game in which the home (away) player
is initially offered $n$ matches. Formally:

$$\mathit{playH} = \lambda n{:}\mathbb{N} \bullet \mathbf{if}\ n = 0\ \mathbf{then}\ \mathit{Home}\ \mathbf{else}\ (\mathit{playA} \circ \mathit{moveH})\ n$$

$$\mathit{playA} = \lambda n{:}\mathbb{N} \bullet \mathbf{if}\ n = 0\ \mathbf{then}\ \mathit{Away}\ \mathbf{else}\ (\mathit{playH} \circ \mathit{moveA})\ n$$

It only remains to code $\mathit{moveH}$ and $\mathit{moveA}$ in whatever way we choose. We
choose to make the home player play the best possible game. No knowledge of
Nim strategy is needed, as it suffices to offer all possible moves to the angel who
will choose the best one:

$$\mathit{moveH} = \lambda n{:}\mathbb{N} \bullet (\sqcup m{:}\mathbb{N} \mid 0 < n - m \le 3)$$

The away player sets out to thwart the home player's ambition, by offering him
the demon's choice among all possible moves (and so obliging him to counter all
of them):

$$\mathit{moveA} = \lambda n{:}\mathbb{N} \bullet (\sqcap m{:}\mathbb{N} \mid 0 < n - m \le 3)$$

There is a winning strategy for the opening player in Nim with $n$ matches initially
($n > 0$), iff the home player as we have coded it always wins, i.e. iff the following
holds

$$\mathit{Home} \sqsubseteq \mathit{playH}\ n$$

We write $\mathit{Home} \sqsubseteq \mathit{playH}\ n$ rather than $\mathit{Home} = \mathit{playH}\ n$ because $\mathit{playH}$ yields
an angelic choice of possibilities of which only one need be $\mathit{Home}$. This example
is interesting because it employs both kinds of nondeterminacy to express a
property ("there is a winning strategy for the opening player in Nim") that has
no evident connection with nondeterminacy.
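Operationally, the angelic choice in $\mathit{moveH}$ behaves like an existential quantifier over
the home player's moves and the demonic choice in $\mathit{moveA}$ like a universal quantifier over
the away player's moves. The sketch below adopts that game-tree reading, which is an
assumption made for illustration and not the lattice model of the paper; the function name
`home_can_win` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def home_can_win(n, home_to_move=True):
    """True iff Home is a possible (angelic) outcome that the demon cannot
    prevent, for n matches offered to the player whose turn it is."""
    if n == 0:
        # Offered an empty pile: the opponent took the last match and lost.
        return home_to_move
    # m is the number of matches remaining after removing 1 to 3 of them.
    options = [m for m in range(n) if 0 < n - m <= 3]
    if home_to_move:
        # Angelic choice: some move must lead to a win for Home.
        return any(home_can_win(m, False) for m in options)
    # Demonic choice: every move of Away must still lead to a win for Home.
    return all(home_can_win(m, True) for m in options)

winning_sizes = [n for n in range(1, 13) if home_can_win(n)]
print(winning_sizes)
```

Under this reading the opening player loses exactly when the initial pile size leaves
remainder 1 on division by 4, which matches the known strategy for this variant of the
game.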

2.5 Quantifications and Nondeterminacy 

In introducing nondeterminacy, we have to make clear whether instantiation
extends to terms that rely essentially on nondeterminacy. It is analogous to
the situation in partial function theory in which we explicitly accommodate
"undefined" terms such as $3/0$. We expect that from $(\forall x{:}\mathbb{Z} \bullet x - x = 0)$ we can
infer $3 - 3 = 0$, but probably not $3/0 - 3/0 = 0$. We adopt a similar convention
here: from $(\forall x{:}\mathbb{Z} \bullet x - x = 0)$ we may infer $3 - 3 = 0$, but not
$\bot_{\mathbb{Z}} - \bot_{\mathbb{Z}} = 0$ or $(2 \sqcap 3) - (2 \sqcap 3) = 0$. In other words, we adopt the convention
that the range of the bound variable does not extend to nondeterministic terms. We
extend this convention to all quantifications. For example,
$(\sqcup x{:}\mathbb{Z} \mid x = 1 \sqcap 2)$ is equivalent to $\bot_{\mathbb{Z}}$, not $1 \sqcap 2$.

3 Proof Theory 

In presenting the axioms, we employ more syntactically abstract forms of demonic
and angelic nondeterminacy, viz. $\bigsqcap S$ and $\bigsqcup S$ for $S$ any set of terms.
These denote the demonic and angelic choice, respectively, over the constituent
terms of $S$. $t \sqcap u$ is equivalently $\bigsqcap\{t, u\}$ and $(\sqcap x{:}T \mid P \bullet t)$ is equivalently
$\bigsqcap\{x{:}T \mid P \bullet t\}$ (and similarly for angelic choice). Above and elsewhere we write
$\{x{:}T \mid P \bullet t\}$ to denote the set of $t[x \backslash i]$ for each $i \in T$ satisfying $P[x \backslash i]$;
$\{x{:}T \mid P\}$ is short for $\{x{:}T \mid P \bullet x\}$, and $\{x{:}T \bullet t\}$ is short for
$\{x{:}T \mid \mathit{true} \bullet t\}$. We write $(\forall X \subseteq T \bullet \cdots)$ as an abbreviation for
$(\forall X{:}\mathbb{P}T \bullet \cdots)$, and similarly for other quantifications. We write
$\{x \in X \bullet t\}$ as an abbreviation for $\{x{:}T \mid x \in X \bullet t\}$ where $X{:}\mathbb{P}T$. We denote the
set of terms of type $T$ by $\mathit{term}_T$ (which includes terms that employ nondeterminacy).

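The axioms have to define not just demonic and angelic nondeterminacy, but
refinement as well. Firstly, $\sqsubseteq$ agrees with $\le$ on the deterministic values in each
type $T$.

$\sqsubseteq$-ext: $(\forall x, y{:}T \bullet x \sqsubseteq y \Leftrightarrow x \le y)$

Refinement is antisymmetric.

$\sqsubseteq$-asym: $t \sqsubseteq u \wedge u \sqsubseteq t \Rightarrow t = u$

The following relate refinement to demonic and angelic choice, respectively,
where $t, u : T$.

$\sqsubseteq$-$\bigsqcap$: $t \sqsubseteq u \Leftrightarrow (\forall X \subseteq T \bullet \bigsqcap X \sqsubseteq t \Rightarrow \bigsqcap X \sqsubseteq u)$

$\sqsubseteq$-$\bigsqcup$: $t \sqsubseteq u \Leftrightarrow (\forall X \subseteq T \bullet u \sqsubseteq \bigsqcup X \Rightarrow t \sqsubseteq \bigsqcup X)$

The axioms for demonic and angelic choice follow, where $X \subseteq T$, $S \subseteq \mathit{term}_T$.

$\bigsqcap$-defn: $\bigsqcap S \sqsubseteq \bigsqcup X \Leftrightarrow (\exists t \in S \bullet t \sqsubseteq \bigsqcup X)$

$\bigsqcup$-defn: $\bigsqcap X \sqsubseteq \bigsqcup S \Leftrightarrow (\exists t \in S \bullet \bigsqcap X \sqsubseteq t)$

The remaining axioms are just definitions. The following define $\sqcap$ and $\sqcup$.

$\sqcap$-defn: $t \sqcap u = \bigsqcap\{t, u\}$

$\sqcup$-defn: $t \sqcup u = \bigsqcup\{t, u\}$

Finally we define the constants.

$\bot$-defn: $\bot_T = \bigsqcup \emptyset$

$\top$-defn: $\top_T = \bigsqcap \emptyset$

$\mathit{all}$-defn: $\mathit{all}_T = \bigsqcap T$

$\mathit{some}$-defn: $\mathit{some}_T = \bigsqcup T$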

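This completes the list of axioms. Just to reassure that we have catered for
$(\sqcap x{:}T \mid P \bullet t)$ and $(\sqcup x{:}T \mid P \bullet t)$, below is $\bigsqcup$-defn with $S$ instantiated as
$\{x{:}T \mid P \bullet t\}$:

$$\bigsqcap X \sqsubseteq (\sqcup x{:}T \mid P \bullet t) \Leftrightarrow (\exists x{:}T \bullet P \wedge (\bigsqcap X \sqsubseteq t))$$

All the properties we would expect of $\sqcap$, $\sqcup$, and $\sqsubseteq$ follow from the axioms;
see Figure 1 (in which $t,u,v : T$, $X \subseteq T$, and $S \subseteq \mathit{term}_T$).

It is clerical routine to prove the theorems in the order presented. We prove
$\mathit{all} \sqsubseteq t \Leftrightarrow t \ne \bot$ as an example.

$$\begin{array}{ll}
 & \mathit{all} \sqsubseteq t \\
\Leftrightarrow & \{\ \mathit{all}\text{-defn},\ \sqsubseteq\text{-}\bigsqcup\ \} \\
 & (\forall X \subseteq T \bullet t \sqsubseteq \bigsqcup X \Rightarrow \bigsqcap T \sqsubseteq \bigsqcup X) \\
\Leftrightarrow & \{\ \bigsqcap\text{-defn}\ \} \\
 & (\forall X \subseteq T \bullet t \sqsubseteq \bigsqcup X \Rightarrow (\exists y{:}T \bullet y \sqsubseteq \bigsqcup X)) \\
\Leftrightarrow & \{\ \bigsqcap\{y\} = y\ \} \\
 & (\forall X \subseteq T \bullet t \sqsubseteq \bigsqcup X \Rightarrow (\exists y{:}T \bullet \bigsqcap\{y\} \sqsubseteq \bigsqcup X)) \\
\Leftrightarrow & \{\ \bigsqcup\text{-defn},\ \bigsqcap\{y\} = y\ \} \\
 & (\forall X \subseteq T \bullet t \sqsubseteq \bigsqcup X \Rightarrow (\exists y{:}T \bullet (\exists x \in X \bullet y \sqsubseteq x))) \\
\Leftrightarrow & \{\ \sqsubseteq \text{ reflexive, logic}\ \} \\
 & (\forall X \subseteq T \bullet t \sqsubseteq \bigsqcup X \Rightarrow X \ne \emptyset) \\
\Leftrightarrow & \{\ \text{logic}\ \} \\
 & (\forall X \subseteq T \bullet X = \emptyset \Rightarrow \neg(t \sqsubseteq \bigsqcup X)) \\
\Leftrightarrow & \{\ \text{logic}\ \} \\
 & \neg(t \sqsubseteq \bigsqcup \emptyset) \\
\Leftrightarrow & \{\ \bot\text{-defn},\ \bot \sqsubseteq t\ \} \\
 & t \ne \bot
\end{array}$$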



4 Model Theory 

4.1 Lattice Fundamentals 

In this section and the two following we give a short dense summary of lattice
theory to the extent that we need it to construct a model for our axioms.
Everything is standard and is available in more detail in any standard text (such
as [4,3]).

A lattice $L$ is a partially ordered set (we'll use $\le$ to represent the partial
ordering) such that every finite subset of $L$ has a greatest lower bound in $L$ with
respect to $\le$, and similarly has a least upper bound in $L$ with respect to $\le$. $L$ is
a complete lattice if greatest lower and least upper bounds are not restricted to
finite subsets of $L$. For $S \subseteq L$, the greatest lower bound of $S$ is denoted by $\bigwedge S$,
and the least upper bound is denoted by $\bigvee S$. Greatest lower bounds are also
called meets, and least upper bounds are also called joins. A complete lattice is
completely distributive iff meets distribute over joins, and vice versa. When we
say that a lattice is completely distributive, we mean to imply that it is also
complete.

A function $f$ from poset $(C, \le_C)$ to poset $(D, \le_D)$ is monotonic iff
$x \le_C y \Rightarrow f\,x \le_D f\,y$ for all $x, y \in C$, and an order embedding iff
$x \le_C y \Leftrightarrow f\,x \le_D f\,y$ for all $x, y \in C$. An order embedding from $(C, \le_C)$ to a
complete lattice is said to be a completion of $(C, \le_C)$.
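On finite posets both properties can be checked by enumeration. The sketch below assumes a
poset is given by a list of elements and an ordering predicate; the function names and the
divisibility example are illustrative assumptions.

```python
def is_monotonic(f, elements, leq_c, leq_d):
    """x <=_C y implies f x <=_D f y, for all x, y in elements."""
    return all(leq_d(f(x), f(y))
               for x in elements for y in elements if leq_c(x, y))

def is_order_embedding(f, elements, leq_c, leq_d):
    """x <=_C y holds exactly when f x <=_D f y, for all x, y in elements."""
    return all(leq_c(x, y) == leq_d(f(x), f(y))
               for x in elements for y in elements)

# Hypothetical example: the divisors of 6 ordered by divisibility.
C = [1, 2, 3, 6]
divides = lambda x, y: y % x == 0
numeric = lambda x, y: x <= y
subset = lambda s, t: s <= t

identity = lambda x: x
prime_factors = lambda n: frozenset(p for p in (2, 3) if n % p == 0)

# The identity into the usual numeric order is monotonic but not an embedding,
# since 2 <= 3 numerically although 2 does not divide 3.
id_monotonic = is_monotonic(identity, C, divides, numeric)
id_embedding = is_order_embedding(identity, C, divides, numeric)
# Mapping each divisor to its set of prime factors embeds C into a powerset.
pf_embedding = is_order_embedding(prime_factors, C, divides, subset)
print(id_monotonic, id_embedding, pf_embedding)
```

The powerset of $\{2, 3\}$ ordered by inclusion is a complete lattice, so `prime_factors` is
a completion of this small poset in the sense just defined.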

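Let $f$ be a function from complete lattice $L$ to complete lattice $M$. $f$ is
$\bigwedge$-distributive iff $f(\bigwedge S) = \bigwedge (f\,S)$ for all $S \subseteq L$; $\bigvee$-distributive is defined
dually. By $f\,S$ above and in what follows, we mean the image of $S$ through $f$, i.e.
$\{x \in S \bullet f\,x\}$. $f$ is a complete homomorphism if $f$ is $\bigwedge$- and $\bigvee$-distributive. Now
let $f$ be a function from poset $C$ to lattice $M$; $C$ may well be a lattice, but in
any event subsets of $C$ may have greatest lower bounds and least upper bounds
in $C$. We say that $f$ is existentially $\bigwedge$-distributive iff $f(\bigwedge S) = \bigwedge (f\,S)$ for all
$S \subseteq C$ such that $\bigwedge S$ exists in $C$; existentially $\bigvee$-distributive is defined dually.

Fundamental theorems

$t = u \Leftrightarrow (\forall X \subseteq T \bullet \bigsqcap X \sqsubseteq t \Leftrightarrow \bigsqcap X \sqsubseteq u)$
$t = u \Leftrightarrow (\forall X \subseteq T \bullet t \sqsubseteq \bigsqcup X \Leftrightarrow u \sqsubseteq \bigsqcup X)$

$\bigsqcap\{t\} = t$
$\bigsqcup\{t\} = t$

$\sqcap$, $\sqcup$ are symmetric, associative, and idempotent
$\sqsubseteq$ is a partial order

$t \sqcap u \sqsubseteq \bigsqcup X \Leftrightarrow t \sqsubseteq \bigsqcup X \vee u \sqsubseteq \bigsqcup X$
$\bigsqcap X \sqsubseteq t \sqcup u \Leftrightarrow \bigsqcap X \sqsubseteq t \vee \bigsqcap X \sqsubseteq u$

$t \sqsubseteq u \sqcap v \Leftrightarrow t \sqsubseteq u \wedge t \sqsubseteq v$
$u \sqcup v \sqsubseteq t \Leftrightarrow u \sqsubseteq t \wedge v \sqsubseteq t$

$t \sqsubseteq \bigsqcap S \Leftrightarrow (\forall u \in S \bullet t \sqsubseteq u)$
$\bigsqcup S \sqsubseteq t \Leftrightarrow (\forall u \in S \bullet u \sqsubseteq t)$

$t \sqsubseteq u \Leftrightarrow t \sqcap u = t$
$t \sqsubseteq u \Leftrightarrow t \sqcup u = u$

$\sqcap$, $\sqcup$ distribute over one another, as do $\bigsqcap$, $\bigsqcup$

$\bot \sqsubseteq t \sqsubseteq \top$

$\mathit{all} \sqsubseteq t \Leftrightarrow t \ne \bot$
$t \sqsubseteq \mathit{some} \Leftrightarrow t \ne \top$

$t \sqcap \bot = \bot$, $t \sqcap \top = t$, $t \sqcup \bot = t$, $t \sqcup \top = \top$

$t = (\sqcap X \subseteq T \mid t \sqsubseteq \bigsqcup X \bullet \bigsqcup X)$
$t = (\sqcup X \subseteq T \mid \bigsqcap X \sqsubseteq t \bullet \bigsqcap X)$

Fig. 1. Fundamental theorems ($t,u,v : T$, $X \subseteq T$, $S \subseteq \mathit{term}_T$)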



4.2 Upsets and Downsets 

Let $(C, \le)$ be a poset. A set $S \subseteq C$ is said to be downclosed (with respect
to $\le$) iff $(\forall x, y \in C \mid x \le y \bullet y \in S \Rightarrow x \in S)$, and
upclosed (with respect to $\le$) iff
$(\forall x, y \in C \mid x \le y \bullet x \in S \Rightarrow y \in S)$. We denote by
$\mathcal{D}C$ the set of downclosed subsets (or downsets) of $C$, and by
$\mathcal{U}C$ the set of upclosed subsets (or upsets). $\mathcal{D}C$ is a completely
distributive lattice partially ordered by $\subseteq$ and with joins and meets
given by unions and intersections, respectively. $\mathcal{U}C$ is a completely
distributive lattice partially ordered by $\supseteq$ and with joins and meets
given by intersections and unions, respectively. For $S \subseteq C$, we denote by
$S{\downarrow}$ the downclosure of $S$, i.e. the smallest downset containing $S$
(this is well defined because downsets are closed under intersection). We
denote by $S{\uparrow}$ the upclosure of $S$, i.e. the smallest upset containing $S$.
The function $\mathit{down} : C \to \mathcal{D}C$ which maps $c$ to $\{c\}{\downarrow}$ is
a completion of $C$, as is $\mathit{up} : C \to \mathcal{U}C$ which maps $c$ to
$\{c\}{\uparrow}$, where the ordering on $\mathcal{D}C$ and $\mathcal{U}C$ is as given above.
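
The closure operators and the two completions can be tried out on a small
finite poset. The following is only a sketch under assumptions that are not in
the text: the example poset (divisibility on {1, 2, 3, 6}) and all function
names are chosen here for illustration.

```python
# Hypothetical example poset: {1, 2, 3, 6} ordered by divisibility.
C = [1, 2, 3, 6]

def leq(x, y):
    """x <= y in the example poset: x divides y."""
    return y % x == 0

def down_closure(S):
    """Smallest downset containing S."""
    return frozenset(x for x in C if any(leq(x, s) for s in S))

def up_closure(S):
    """Smallest upset containing S."""
    return frozenset(y for y in C if any(leq(s, y) for s in S))

def down(c):
    """The completion down: c is mapped to the downclosure of {c}."""
    return down_closure({c})

def up(c):
    """The completion up: c is mapped to the upclosure of {c}."""
    return up_closure({c})

# Downsets are ordered by inclusion, upsets by reverse inclusion, so an
# order embedding must satisfy x <= y exactly when the images are so related.
down_is_embedding = all(leq(x, y) == (down(x) <= down(y)) for x in C for y in C)
up_is_embedding = all(leq(x, y) == (up(x) >= up(y)) for x in C for y in C)
```

On this example both flags evaluate to true, which is the finite counterpart of
the statement that down and up are completions.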

4.3 Denseness and Primeness 

The definitions in this section are needed primarily in proofs, and can be
passed over lightly by readers who do not intend to follow the proofs.

A completion $f : C \to L$ is said to be join dense iff
$x = \bigvee\{c \in C \mid f\,c \le x \bullet f\,c\}$ for all $x \in L$. $f$ is meet dense iff
$x = \bigwedge\{c \in C \mid x \le f\,c \bullet f\,c\}$ for all $x \in L$. Join dense
completions are existentially $\bigwedge$-distributive, and meet dense completions
are existentially $\bigvee$-distributive. An element $x$ of a complete lattice $L$ is
completely join prime iff $x \le \bigvee S \Leftrightarrow (\exists y \in S \bullet x \le y)$ for
all $S \subseteq L$. $x$ is completely meet prime iff
$\bigwedge S \le x \Leftrightarrow (\exists y \in S \bullet y \le x)$ for all $S \subseteq L$. A
completion $f : C \to L$ is said to be completely join prime (completely meet
prime) if every $x \in f\,C$ is completely join prime (completely meet prime,
respectively). $\mathit{up}$ is meet dense and completely meet prime, and
$\mathit{down}$ is join dense and completely join prime.



4.4 Free Completely Distributive Lattice over a Poset 

A completely distributive lattice $L$ is called the free completely distributive
lattice over a poset $C$ iff there is a completion $\phi : C \to L$ such that for
every completely distributive lattice $M$ and monotonic function $f : C \to M$
there is a unique complete homomorphism $f^{*} : L \to M$ satisfying
$f = f^{*} \circ \phi$. In this context we say that $\phi$ is a free completely
distributive completion with simulator $\_^{*}$.

Definition 1. Let $\phi : C \to L$ be a completion of poset $C$ into complete lattice
$L$. For any monotonic $f : C \to M$, where $M$ is a complete lattice, we define
$f^{\vee}_{\phi}, f^{\wedge}_{\phi} : L \to M$ by

(i) $f^{\vee}_{\phi} = \lambda x{:}L \bullet \bigvee\{c \in C \mid \phi\,c \le x \bullet f\,c\}$

(ii) $f^{\wedge}_{\phi} = \lambda x{:}L \bullet \bigwedge\{c \in C \mid x \le \phi\,c \bullet f\,c\}$. □

It is easy to see that both $f^{\vee}_{\phi}$ and $f^{\wedge}_{\phi}$ are monotonic. The
central theorem that allows us to construct the free completely distributive
lattice over a poset is the following.

Theorem 1. Let $\phi : C \to L$ be a completion of poset $C$ into complete lattice $L$,
and $\theta : L \to M$ be a completion of $L$ into completely distributive lattice $M$.

(i) If $\theta$ is join dense and completely join prime, and $\phi$ is meet dense and
completely meet prime, then $\theta \circ \phi$ is a free completely distributive
completion of $C$ with simulator $(\_^{\wedge}_{\phi})^{\vee}_{\theta}$.

(ii) If $\theta$ is meet dense and completely meet prime, and $\phi$ is join dense and
completely join prime, then $\theta \circ \phi$ is a free completely distributive
completion of $C$ with simulator $(\_^{\vee}_{\phi})^{\wedge}_{\theta}$. □
We shall prove Theorem 1 shortly. We need only exhibit a $\theta$ and $\phi$ as
described in the theorem to have a free completely distributive lattice. By a
standard argument on free objects, all free completely distributive lattices on
a given poset are isomorphic. We denote by $\mathrm{FCD}(C, \le)$ the free
completely distributive lattice on poset $(C, \le)$.

Theorem 2. $\mathrm{FCD}(C, \le)$ exists for every poset $(C, \le)$.

Proof. Apply Theorem 1 (i) with $\theta$ instantiated as $\mathit{down}$ and $\phi$
instantiated as $\mathit{up}$, or Theorem 1 (ii) with $\theta$ instantiated as
$\mathit{up}$ and $\phi$ instantiated as $\mathit{down}$. □
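
For a finite poset the up-down construction can be enumerated directly. The
sketch below is an illustration under assumptions of our own: the two-element
discretely ordered poset {a, b}, brute-force enumeration of subsets, and all
function names are hypothetical and are not part of the construction in the
text.

```python
from itertools import combinations

def subsets(xs):
    """All subsets of a finite collection, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def upsets(C, leq):
    """All upclosed subsets of the finite poset (C, leq)."""
    return [S for S in subsets(C)
            if all(y in S for x in S for y in C if leq(x, y))]

def downsets(P, leq):
    """All downclosed subsets of the finite poset (P, leq)."""
    return [S for S in subsets(P)
            if all(x in S for y in S for x in P if leq(x, y))]

# Hypothetical example: the discretely ordered poset {a, b}.
C = ["a", "b"]

def discrete(x, y):
    return x == y

def rev_incl(s, t):
    """Upsets are ordered by reverse inclusion: s <= t iff s contains t."""
    return s >= t

U = upsets(C, discrete)      # the upsets of C
DU = downsets(U, rev_incl)   # carrier of the up-down lattice

def embed(c):
    """(down o up) c: the downset of U generated by the upclosure of {c}."""
    up_c = frozenset(y for y in C if discrete(c, y))
    return frozenset(s for s in U if rev_incl(s, up_c))
```

On this example the up-down lattice has six elements, and meets and joins of
the embedded generators (intersections and unions) stay inside it.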

The rest of this section is devoted to proving Theorem 1, via a series of lemmas 
(Lemma 5 will play a second role in showing soundness). 

Lemma 1. Let $\phi : C \to L$ be a completion of poset $C$ into complete lattice $L$,
and let $f : C \to M$ be monotonic. Then $f = f^{\vee}_{\phi} \circ \phi$ and
$f = f^{\wedge}_{\phi} \circ \phi$.

Proof. Easy exercise, needing only that $f$ is monotonic and $\phi$ is an order
embedding. □



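Lemma 2. Let $\phi : C \to L$ be a completion of poset $C$ into complete lattice $L$,
and let $f : C \to M$ be monotonic.

(i) If $\phi$ is completely join prime then $f^{\vee}_{\phi}$ is $\bigvee$-distributive.

(ii) If $\phi$ is completely meet prime then $f^{\wedge}_{\phi}$ is $\bigwedge$-distributive.

Proof. We prove (i). Let $S \subseteq L$.

\[
\begin{array}{cl}
 & f^{\vee}_{\phi}(\bigvee S) \\
= & \quad \text{defn } f^{\vee}_{\phi} \\
 & \bigvee\{c \in C \mid \phi\,c \le \bigvee S \bullet f\,c\} \\
= & \quad \text{$\phi$ completely join prime} \\
 & \bigvee\{c \in C \mid (\exists y \in S \bullet \phi\,c \le y) \bullet f\,c\} \\
= & \quad \text{set theory} \\
 & \bigvee(\bigcup\{y \in S \bullet \{c \in C \mid \phi\,c \le y \bullet f\,c\}\}) \\
= & \quad \text{$\bigvee$ distributes over $\bigcup$} \\
 & \bigvee\{y \in S \bullet \bigvee\{c \in C \mid \phi\,c \le y \bullet f\,c\}\} \\
= & \quad \text{defn } f^{\vee}_{\phi} \\
 & \bigvee\{y \in S \bullet f^{\vee}_{\phi}\,y\} \\
= & \quad \text{defn of image function} \\
 & \bigvee(f^{\vee}_{\phi}\,S)
\end{array}
\]

□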



Lemma 3. Let $\theta : C \to L$ be a completion of poset $C$ into complete lattice $L$,
and let $f : C \to M$ be monotonic with $M$ a complete lattice.

(i) If $\theta$ is join dense then $f^{\vee}_{\theta}$ inherits the
$\bigwedge$-distributivity of $f$, i.e. whenever $f$ distributes over $\bigwedge S$,
$f^{\vee}_{\theta}$ distributes over $\bigwedge(\theta\,S)$ for all $S \subseteq C$ such that
$\bigwedge S$ exists.

(ii) If $\theta$ is meet dense then $f^{\wedge}_{\theta}$ inherits the
$\bigvee$-distributivity of $f$.

Proof. The proof of (i) is an easy inference from Lemma 1 and the fact that
$\theta$ is $\bigwedge$-distributive (because it is join dense). The proof of (ii) is
dual. □

We need notation to capture arbitrary distributions of meets over joins. Let $I$
denote a set and let $J_i$ denote a set for each $i \in I$. We denote by
$[i{:}I \to J_i]$ the set of functions $f$ from $I$ to $\bigcup\{i \in I \bullet J_i\}$
satisfying $f\,i \in J_i$ for each $i \in I$. That complete lattice $L$ is completely
distributive is expressed as follows:

\[
\bigwedge\{i \in I \bullet \bigvee\{j \in J_i \bullet x_{i,j}\}\}
  = \bigvee\{f \in [i{:}I \to J_i] \bullet \bigwedge\{i \in I \bullet x_{i,f\,i}\}\}
\]

where $x_{i,j} \in L$ for each $i \in I$ and $j \in J_i$. We use the notation in the
following lemma, which is the crux of proving Theorem 1.



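Lemma 4. Let $\theta : L \to M$ be a completion where $L$ is a complete lattice and
$M$ is a completely distributive lattice, and let $f : L \to N$ where $N$ is a
complete lattice.

(i) If $\theta$ is join dense and $f$ is $\bigwedge$-distributive, then $f^{\vee}_{\theta}$
is also $\bigwedge$-distributive.

(ii) If $\theta$ is meet dense and $f$ is $\bigvee$-distributive, then $f^{\wedge}_{\theta}$
is also $\bigvee$-distributive.

Proof. We prove (i). Let $S \subseteq M$.

\[
\begin{array}{cl}
 & f^{\vee}_{\theta}(\bigwedge S) \\
= & \quad \text{set notation} \\
 & f^{\vee}_{\theta}(\bigwedge\{s \in S \bullet s\}) \\
= & \quad \text{$\theta$ join dense} \\
 & f^{\vee}_{\theta}(\bigwedge\{s \in S \bullet \bigvee\{k \in L \mid \theta\,k \le s \bullet \theta\,k\}\}) \\
= & \quad \text{define } L_s = \{k \in L \mid \theta\,k \le s\} \\
 & f^{\vee}_{\theta}(\bigwedge\{s \in S \bullet \bigvee\{k \in L_s \bullet \theta\,k\}\}) \\
= & \quad \text{$M$ is completely distributive} \\
 & f^{\vee}_{\theta}(\bigvee\{h \in [s{:}S \to L_s] \bullet \bigwedge\{s \in S \bullet \theta(h\,s)\}\}) \\
= & \quad \text{$f^{\vee}_{\theta}$ is $\bigvee$-distributive as $\theta$ join dense} \\
 & \bigvee\{h \in [s{:}S \to L_s] \bullet f^{\vee}_{\theta}(\bigwedge\{s \in S \bullet \theta(h\,s)\})\} \\
= & \quad \text{Lemma 3 ($f$ is $\bigwedge$-distributive, $\theta$ join dense, $M$ completely distributive)} \\
 & \bigvee\{h \in [s{:}S \to L_s] \bullet \bigwedge\{s \in S \bullet f^{\vee}_{\theta}(\theta(h\,s))\}\} \\
= & \quad \text{$M$ is completely distributive} \\
 & \bigwedge\{s \in S \bullet \bigvee\{k \in L_s \bullet f^{\vee}_{\theta}(\theta\,k)\}\} \\
= & \quad \text{$f^{\vee}_{\theta}$ is $\bigvee$-distributive} \\
 & \bigwedge\{s \in S \bullet f^{\vee}_{\theta}(\bigvee\{k \in L_s \bullet \theta\,k\})\} \\
= & \quad \text{repeating the operations in steps 2-4 above, in reverse order} \\
 & \bigwedge\{s \in S \bullet f^{\vee}_{\theta}\,s\} \\
= & \quad \text{function image notation} \\
 & \bigwedge(f^{\vee}_{\theta}\,S)
\end{array}
\]

□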



Lemma 5. Let $\phi : C \to L$ be a completion of poset $C$ into complete lattice $L$,
and $\theta : L \to M$ be a completion of $L$ into complete lattice $M$. Then for all
$p \in M$

(i) $p = \bigvee\{S{:}\mathbb{P}C \mid \bigwedge(\theta(\phi\,S)) \le p \bullet \bigwedge(\theta(\phi\,S))\}$
if $\theta$ is join dense and $\phi$ is meet dense

(ii) $p = \bigwedge\{S{:}\mathbb{P}C \mid p \le \bigvee(\theta(\phi\,S)) \bullet \bigvee(\theta(\phi\,S))\}$
if $\theta$ is meet dense and $\phi$ is join dense.

Proof of (i). As $\theta$ is join dense, $p = \bigvee\{q{:}L \mid \theta\,q \le p \bullet \theta\,q\}$.
As $\phi$ is meet dense, every $q \in L$ is equal to $\bigwedge(\phi\,S)$ for some
$S \subseteq C$, and hence $L = \{S{:}\mathbb{P}C \bullet \bigwedge(\phi\,S)\}$. Therefore
$p = \bigvee\{S{:}\mathbb{P}C \mid \theta(\bigwedge(\phi\,S)) \le p \bullet \theta(\bigwedge(\phi\,S))\}$.
As $\theta$ is a join dense function on a complete lattice it is
$\bigwedge$-distributive, and hence
$p = \bigvee\{S{:}\mathbb{P}C \mid \bigwedge(\theta(\phi\,S)) \le p \bullet \bigwedge(\theta(\phi\,S))\}$. □

We now have all we need to prove Theorem 1. 

Proof of Theorem 1: We prove part (i). It is a simple exercise to show that
$\theta \circ \phi$ is an order embedding from $C$ to $M$ and hence a completion. Let
$f : C \to N$ be monotonic with $N$ a completely distributive lattice. From two
simple applications of Lemma 1 we have
$(f^{\wedge}_{\phi})^{\vee}_{\theta} \circ (\theta \circ \phi) = f$. As $\theta$ is completely
join prime, it follows from Lemma 2 that $(f^{\wedge}_{\phi})^{\vee}_{\theta}$ is
$\bigvee$-distributive. As $\phi$ is completely meet prime it follows from Lemma 2
that $f^{\wedge}_{\phi}$ is $\bigwedge$-distributive, and hence from Lemma 4 that
$(f^{\wedge}_{\phi})^{\vee}_{\theta}$ is $\bigwedge$-distributive. Therefore
$(f^{\wedge}_{\phi})^{\vee}_{\theta}$ is a complete homomorphism. By Lemma 5 (i) $C$
generates $M$, i.e. every element in $M$ is obtained by taking meets and joins of
elements in $(\theta \circ \phi)\,C$. As all candidate complete homomorphisms by
definition agree on $(\theta \circ \phi)\,C$ and are distributive in all possible
ways, it follows that $(f^{\wedge}_{\phi})^{\vee}_{\theta}$ is unique. □

4.5 Soundness 

We show that the free completely distributive lattice is a model for our axioms.
We presume that in the absence of nondeterminacy, each type $T$ is represented
by a partially ordered set $([T], \le_{[T]})$ whose ordering agrees with $\le_T$,
i.e. $[t] \le_{[T]} [u]$ iff $t \le_T u$ where $[t]$ is the denotation of $t : T$ in
$[T]$, and similarly for $u : T$. To accommodate nondeterminacy, we elevate the
representation of $T$ from $([T], \le_{[T]})$ to $\mathrm{FCD}([T], \le_{[T]})$
represented by the up-down lattice $\mathcal{D}(\mathcal{U}[T])$, and we elevate the
representation of each term $t$ from $[t]$ to $(\mathit{down} \circ \mathit{up})[t]$.
The refinement relation, demonic nondeterminacy, and angelic nondeterminacy in
$T$ are represented by the ordering relation, the meet operation, and the join
operation in $\mathcal{D}(\mathcal{U}[T])$, respectively.

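Our six axioms translate to the model as follows, where
$t, u \in \mathcal{D}(\mathcal{U}[T])$, $X \subseteq [T]$ and
$S \subseteq \mathcal{D}(\mathcal{U}[T])$:

$\sqsubseteq$-ext-M: $(\forall x, y{:}[T] \bullet (\mathit{down} \circ \mathit{up})\,x \le (\mathit{down} \circ \mathit{up})\,y \Leftrightarrow x \le_{[T]} y)$

$\sqsubseteq$-asym-M: $t \le u \wedge u \le t \Rightarrow t = u$

$\sqsubseteq$-$\sqcap$-M: $t \le u \Leftrightarrow (\forall X \subseteq [T] \bullet \bigwedge((\mathit{down} \circ \mathit{up})X) \le t \Rightarrow \bigwedge((\mathit{down} \circ \mathit{up})X) \le u)$

$\sqsubseteq$-$\sqcup$-M: $t \le u \Leftrightarrow (\forall X \subseteq [T] \bullet u \le \bigvee((\mathit{down} \circ \mathit{up})X) \Rightarrow t \le \bigvee((\mathit{down} \circ \mathit{up})X))$

$\sqcap$-defn-M: $\bigwedge S \le \bigvee((\mathit{down} \circ \mathit{up})X) \Leftrightarrow (\exists s \in S \bullet s \le \bigvee((\mathit{down} \circ \mathit{up})X))$

$\sqcup$-defn-M: $\bigwedge((\mathit{down} \circ \mathit{up})X) \le \bigvee S \Leftrightarrow (\exists s \in S \bullet \bigwedge((\mathit{down} \circ \mathit{up})X) \le s)$

$\sqsubseteq$-ext-M and $\sqsubseteq$-asym-M obviously hold: the former says that
$\mathit{down} \circ \mathit{up}$ is an order embedding, and the latter that $\le$ in
$\mathcal{D}(\mathcal{U}[T])$ is antisymmetric.

$\sqsubseteq$-$\sqcap$-M follows easily from Lemma 5 (i) in which $\theta$ is replaced by
$\mathit{down}$ and $\phi$ by $\mathit{up}$. $\sqsubseteq$-$\sqcup$-M follows similarly
from Lemma 5 (ii).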
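We prove $\sqcup$-defn-M as follows.

\[
\begin{array}{cl}
 & \bigwedge((\mathit{down} \circ \mathit{up})X) \le \bigvee S \\
\Leftrightarrow & \quad \text{$\mathit{down}$ is $\bigwedge$-distributive} \\
 & \mathit{down}(\bigwedge(\mathit{up}\,X)) \le \bigvee S \\
\Leftrightarrow & \quad \text{$\mathit{down}$ is completely join prime} \\
 & (\exists s \in S \bullet \mathit{down}(\bigwedge(\mathit{up}\,X)) \le s) \\
\Leftrightarrow & \quad \text{$\mathit{down}$ is $\bigwedge$-distributive} \\
 & (\exists s \in S \bullet \bigwedge((\mathit{down} \circ \mathit{up})X) \le s)
\end{array}
\]

Dually, $\sqcap$-defn-M holds in the down-up model, i.e.
$\mathcal{U}(\mathcal{D}[T])$, which is isomorphic to the up-down model. The property
transfers trivially from the down-up model to the up-down model using the
isomorphism that maps $(\mathit{up} \circ \mathit{down})\,X$ to
$(\mathit{down} \circ \mathit{up})\,X$ for any $X \subseteq [T]$.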

5 Conclusion 

We have presented axioms governing angelic and demonic nondeterminacy in 
terms at the type level, and constructed a model to show their soundness. 
$\mathrm{FCD}(C, \le)$ seems to capture unbounded nondeterminacy so closely that we
are tempted to say that the theory of nondeterminacy is more or less the theory
of free completely distributive lattices over a poset. Indeed, once
$\mathrm{FCD}(C, \le)$ is constructed, proving soundness of the axioms is almost
trivial.

The free completely distributive lattice over a poset seems to be a neglected
part of free lattice theory. At least, it does not merit a mention in the
encyclopaedic [6]. Markowsky [9] described it, without proof, for the special
case that the poset is discretely ordered, and Bartenschlager [2] constructed a
representation of $\mathrm{FCD}(C, \le)$ in a very different setting as a certain
“concept lattice” [7]. The construction above is original as far as we know. It
has been guided by our need to facilitate a proof of soundness of the axioms.
Indeed, the model and axioms were constructed together because a satisfactory
axiomatisation proved elusive initially.

Our interest in giving a satisfactory mathematical account of nondeterminacy 
was motivated originally by the role of nondeterminacy in specifications and 
programs. However, we believe the theory will combine elegantly with function 
theory to yield a richer theory of functions in which some operations become 
less partial. This will impact favourably on various programming techniques 
that have no obvious connection with nondeterminacy, such as data refinement 
and abstract interpretation. We are currently exploring these possibilities.
Abstract. Erasure of information incurs an increase in entropy and
dissipates heat. Therefore, information-preserving computation is essen- 
tial for constructing computers that use energy more effectively. A more 
recent motivation to understand reversible transformations also comes 
from the design of editors where editing actions on a view need to be 
reflected back to the source data. In this paper we present a point-free 
functional language, with a relational semantics, in which the program- 
mer is allowed to define injective functions only. Non-injective functions 
can be transformed into a program returning a history. The language 
is presented with many examples, and its relationship with Bennett’s 
reversible Turing machine is explained. The language serves as a good 
model for program construction and reasoning for reversible computers, 
and hopefully for modelling bi-directional updating in an editor. 



1 Introduction 

The interest in reversible computation arose from the wish to build computers
dissipating less heat. In his paper in 1961, Landauer [17] noted that it is not
the computation, but the erasure of information, that generates an increase in
entropy and thus dissipates heat. Since then, various models of computation that
do not erase information, and are thus capable of reconstructing the input from
the output, have been proposed. Lecerf [18] and Bennett [4] independently
developed their reversible Turing machines. Toffoli [26] proposed an information
preserving logic, in which traditional logic can be embedded. Fredkin and
Toffoli [11] then presented their “ballistic computer”, which departs
dramatically from typical computers and instead resembles the movement of
particles, yet is computationally equivalent to reversible Turing machines.
Recent research actually attempts to build VLSI chips that do not erase
information [27]. Due to the interest in quantum computing, reversible
computation has recently attracted a wide range of researchers [24].

As was pointed out by Baker [3], were it only a problem in the hardware, we 
could compile ordinary programs for reversible computers and hide the reality 
from the programmers. Yet it is actually the highest level of computation where 
the loss of information is the most difficult to deal with. It is thus desirable to 
have programming languages that have more structure - a language designed
for reversible computation. 

Our interest in reversible computation came from yet another area. In our 
Programmable Structured Documents project [25], we are developing a struc- 
tural editor/ viewer for XML documents with embedded computations. The 
source document is transformed into a view in ways defined by the document de- 
signer. When a user edits the view, the changes need to be reflected back to the 
original source. It is thus preferable that the transformation be programmed in 
a language where only reversible transformation is allowed. When non-reversible 
computation is needed, it should be made explicit what extra information needs 
to be remembered to perform the view-to-source transformation. Dependency 
among different parts of the view shall also be made explicit so the system 
knows how to maintain consistency when parts of the view are edited. 

In this paper we will present a point-free functional language in which all 
functions definable are injective. Previous research was devoted to the design of 
models and languages for reversible computation [26,4,5,19,3,10,28]. Several 
features distinguish our work from previous results. Since it is a language in 
which we want to specify XML transformations, we prefer a high-level, possibly 
functional, programming language. It has a relational semantics, thus all pro- 
grams have a relational interpretation. While some previous works focus more 
on the computational aspect, our language serves as a good model for program 
construction, derivation and reasoning of reversible programs. We have another 
target application in mind: to extend the language to model the bi-directional 
updating problem in an editor, studied by [21, 14]. We hope that the relational 
semantics can shed new light on some of the difficulties in this field. 

In the next three sections, we introduce the basic concepts of relations,
a point-free functional language, and finally our injective language Inv,
each of which is a refinement of the previous one. Some examples of useful injec- 
tive functions defined in Inv are given in Section 5. We describe how non-injective 
functions can be “compiled” into Inv functions in Section 6, where we also discuss 
the relationship with Bennett’s reversible Turing machine. Some implementation 
issues are discussed in Section 7. 

2 Relations 

Since around the 1980s, the program derivation community has come to realise
that there are certain advantages in basing a theory of functional programming
on relations [2,6]. More recent research has also argued that relations are a
suitable model for talking about program inversion because they are more
symmetric [22]. In this section we will give a minimal introduction to
relations.

A relation of type $A \to B$ is a set of pairs whose first component has type $A$
and second component type $B$. When a pair $(a, b)$ is a member of a relation $R$,
we say that $a$ is mapped to $b$ by $R$. A (partial) function (see footnote 1),
under this interpretation, is a special case of a relation that is simple - a
value in $A$ is mapped to at most one value in $B$. That is, if $(a, b) \in R$ and
$(a, b') \in R$, then $b = b'$. For example, the function
$\mathit{fst} :: (A \times B) \to A$ extracting the first component of a pair,
usually denoted pointwise as $\mathit{fst}\,(a, b) = a$, is defined by the
following set:

$\mathit{fst} = \{((a, b), a) \mid a \in A \wedge b \in B\}$

where $(a, b)$ indeed uniquely determines $a$. The function
$\mathit{snd} :: (A \times B) \to B$ is defined similarly.

1 For convenience, we refer to possibly partial functions when we say “functions”.
Other papers may adopt different conventions.

The domain of a relation $R :: A \to B$ is the set
$\{a \in A \mid \exists b \in B :: (a, b) \in R\}$. The range of $R$ is defined
symmetrically. The converse of a relation $R$, written $R^\circ$, is obtained by
swapping the pairs in $R$. That is,

$(b, a) \in R^\circ \equiv (a, b) \in R$

An injective function is one whose converse is also a function. In such cases a
value in the domain uniquely defines its image in the range, and vice versa. The
term inverse of a function is usually reserved to denote the converse of an
injective function. Given relations $R :: A \to B$ and $S :: B \to C$, their
composition $R; S$ is defined by:

$R; S = \{(a, c) \mid \exists b :: (a, b) \in R \wedge (b, c) \in S\}$

The converse operator $^\circ$ distributes into composition contravariantly:

$(R; S)^\circ = S^\circ; R^\circ$

Given two relations $R$ and $S$ of the same type, one can take their union
$R \cup S$ and intersection $R \cap S$. The union, when $R$ and $S$ have disjoint
domains, usually corresponds to conditional branches in a programming language.
The intersection is a very powerful mechanism for defining useful relations. For
example, the function $\mathit{dup}\,a = (a, a)$, which duplicates its argument,
can be defined by:

$\mathit{dup} = \mathit{fst}^\circ \cap \mathit{snd}^\circ$

Here the relation $\mathit{fst}^\circ$, due to type constraints, has to have type
$A \to (A \times A)$. It can therefore be written as the set
$\{(a, (a, a')) \mid a \in A \wedge a' \in A\}$; that is, given $a$,
$\mathit{fst}^\circ$ maps it to $(a, a')$ where $a'$ is an arbitrary value in $A$.
Similarly, $\mathit{snd}^\circ$ maps $a$ to $(a', a)$ for an arbitrary $a'$. The only
point where they coincide is $(a, a)$. That is, taking their intersection we get

$\mathit{dup} = \{(a, (a, a)) \mid a \in A\}$

which is what we expect.

The converse operator $^\circ$ distributes into union and intersection. If we take
the converse of dup, we get:

$\mathit{dup}^\circ = \mathit{fst} \cap \mathit{snd}$

Given a pair, fst extracts its first component, while snd extracts the second.
The intersection means that the results have to be equal. That is,
$\mathit{dup}^\circ$ takes a pair and lets it go through only if the two components
are equal. That explains the observation in [12] that to “undo” a duplication,
we have to perform an equality test.
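
The relational definitions above can be checked mechanically on a small finite
carrier. The following is a minimal sketch, assuming a three-element set A
chosen here for illustration; relations are represented as Python sets of
pairs, and the helper names `converse` and `compose` are ours.

```python
from itertools import product

A = {0, 1, 2}                  # hypothetical small carrier set
PAIRS = set(product(A, A))     # the product A x A

def converse(R):
    """Swap the components of every pair in R."""
    return {(b, a) for (a, b) in R}

def compose(R, S):
    """Relational composition R; S (first R, then S)."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

# fst and snd as sets of (input, output) pairs.
fst = {((a, b), a) for (a, b) in PAIRS}
snd = {((a, b), b) for (a, b) in PAIRS}

# dup as the intersection of the two converses, and its own converse.
dup = converse(fst) & converse(snd)
dup_conv = fst & snd
```

The intersection keeps only the pairs on which both converses agree, so `dup`
contains exactly the pairs (a, (a, a)), and `dup_conv` relates only pairs with
equal components, which is the equality test mentioned above.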




292 Shin-Cheng Mu, Zhenjiang Hu, and Masato Takeichi 



3 The Point-Free Functional Language Fun 

The intersection is a very powerful construct for specification - with it one can
define undecidable specifications. In this section we will refine the relational
constructs to a point-free functional language which is computationally
equivalent to the conventional programming languages we are familiar with. The
syntax of Fun is defined as follows (see footnote 2):

\[
\begin{array}{rcl}
F & ::= & C \mid C^\circ \\
  & \mid & F; F \mid \mathit{id} \\
  & \mid & \langle F, F \rangle \mid \mathit{fst} \mid \mathit{snd} \\
  & \mid & F \cup F \\
  & \mid & \mu(X : F_X) \\
C & ::= & \mathit{nil} \mid \mathit{cons} \mid \mathit{zero} \mid \mathit{succ}
\end{array}
\]

The base types of Fun are natural numbers, polymorphic lists, and Unit, the
type containing only one element (). The function $\mathit{nil} :: B \to [A]$ is a
constant function always returning the empty list, while
$\mathit{cons} :: (A \times [A]) \to [A]$ extends a list by the given element.
Converses are applied to base constructors (denoted by the non-terminal $C$)
only. We abuse the notation a bit by denoting the converse of all elements in
$C$ by $C^\circ$, and by $F_X$ we denote the union of $F$ and the set of variable
names $X$. The converse of cons, for example, decomposes a non-empty list into
the head and the tail. The converse of nil matches only the empty list and maps
it to anything. The result is usually thrown away. To avoid non-determinism in
the language, we let the range type of $\mathit{nil}^\circ$ be Unit. Functions zero
and succ are defined similarly. Presently we do not yet need a constructor for
the type Unit.

The function id is the identity function, the unit of composition. Functions
fst and snd extract the first and second components of a pair respectively.
Those who are familiar with the “squiggle” style of program derivation would
feel at home with the “split” construct, defined by:

$\langle f, g \rangle\,a = (f\,a, g\,a)$

The angle brackets on the left-hand side denote the split construct, while the
parentheses on the right-hand side denote pairs. This definition, however,
assumes that $f$ and $g$ are functional. Less well-known is its relational
definition in terms of intersection:

$\langle f, g \rangle = f; \mathit{fst}^\circ \cap g; \mathit{snd}^\circ$

For example, the dup function in the previous section is defined by
$\langle \mathit{id}, \mathit{id} \rangle$.

With fst, snd and the split we can define plenty of “piping functions” useful
for point-free programming. For example, the function
$\mathit{swap} :: (A \times B) \to (B \times A)$, swapping the components in a pair,
and the function $\mathit{assocr} :: ((A \times B) \times C) \to (A \times (B \times C))$,
are defined by:

$\mathit{swap} = \langle \mathit{snd}, \mathit{fst} \rangle$
$\mathit{assocr} = \langle \mathit{fst}; \mathit{fst}, \langle \mathit{fst}; \mathit{snd}, \mathit{snd} \rangle \rangle$

The function $\mathit{assocl} :: (A \times (B \times C)) \to ((A \times B) \times C)$ can
be defined similarly. The “product” functor $(f \times g)$, on the other hand, is
defined by

$(f \times g)\,(a, b) = (f\,a, g\,b)$

Squigglists are more familiar with its point-free definition:

$(f \times g) = \langle \mathit{fst}; f, \mathit{snd}; g \rangle$

Union of functions is still defined as set union. To avoid non-determinism,
however, we require in $f \cup g$ that $f$ and $g$ have disjoint domains. Arbitrary
use of intersection, on the other hand, is restricted to its implicit occurrence
in splits.

2 In Fun there are no primitive operators for equality or inequality checks.
They can be defined recursively on natural numbers, so we omitted them from the
language to make it simpler. In Inv, however, we do include those checks as
primitives since they are important for inversion.

Finally, $\mu F$ denotes the unique fixed-point of the Fun-valued function $F$,
with which we can define recursive functions. The important issue of whether a
relation-valued function has a unique fixed-point shall not be overlooked. It
was shown in [9] that the uniqueness of the fixed-point is closely related to
well-foundedness and termination. All recursive definitions in this paper do
have unique fixed-points, although it is out of the scope of this paper to
verify them.

As an example, the concatenation of two cons-lists is usually defined
recursively as below:

$[\,] \mathbin{+\!\!+} y = y$
$(a : x) \mathbin{+\!\!+} y = a : (x \mathbin{+\!\!+} y)$

Its uncurried variation, usually called $\mathit{cat} :: ([A] \times [A]) \to [A]$,
can be written in point-free style in Fun as:

$\mathit{cat} = \mu(X : (\mathit{nil}^\circ \times \mathit{id}); \mathit{snd} \;\cup\; (\mathit{cons}^\circ \times \mathit{id}); \mathit{assocr}; (\mathit{id} \times X); \mathit{cons})$

The two branches of $\cup$ correspond to the two clauses of $\mathbin{+\!\!+}$, while
$(\mathit{nil}^\circ \times \mathit{id})$ and $(\mathit{cons}^\circ \times \mathit{id})$ act
as patterns. The term $(\mathit{cons}^\circ \times \mathit{id})$ decomposes the first
component of a pair into its head and tail, while
$(\mathit{nil}^\circ \times \mathit{id})$ checks whether that component is the empty
list. The piping function assocr distributes the values to the right places
before the recursive call.
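
To see the point-free definition of cat at work, it can be transcribed into a
few combinators. This is only a sketch under assumptions of our own: partial
functions are modelled as Python functions that raise a hypothetical
`Undefined` exception outside their domain, lists are Python lists, and the
combinator names (`seq`, `prod`, `union`) are ours rather than part of Fun.

```python
class Undefined(Exception):
    """Raised when a partial function is applied outside its domain."""

def seq(*fs):
    """Left-to-right composition f; g; ..."""
    def run(x):
        for f in fs:
            x = f(x)
        return x
    return run

def prod(f, g):
    """The product functor (f x g) on pairs."""
    return lambda p: (f(p[0]), g(p[1]))

def union(f, g):
    """Union of two functions; sound only when their domains are disjoint."""
    def run(x):
        try:
            return f(x)
        except Undefined:
            return g(x)
    return run

ident = lambda x: x
snd = lambda p: p[1]
assocr = lambda p: (p[0][0], (p[0][1], p[1]))
cons = lambda p: [p[0]] + p[1]

def nil_conv(xs):
    """Converse of nil: matches only the empty list and returns unit."""
    if xs == []:
        return ()
    raise Undefined

def cons_conv(xs):
    """Converse of cons: splits a non-empty list into head and tail."""
    if xs:
        return (xs[0], xs[1:])
    raise Undefined

def cat(p):
    """The fixed point is rendered as ordinary recursion on cat itself."""
    body = union(seq(prod(nil_conv, ident), snd),
                 seq(prod(cons_conv, ident), assocr, prod(ident, cat), cons))
    return body(p)
```

The two arguments of `union` correspond to the two clauses of the pointwise
definition; trying the first branch before the second is harmless here because
the empty and non-empty patterns never overlap.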

We decided to make the language point-free because it is suitable for inver- 
sion - composition is simply run backward, and the need to use piping functions 
makes the control flow explicit. It is true, however, that point-free programs
are sometimes difficult to read. To aid understanding we will supply pointwise
definitions of complicated functions. The conversion between the point-free and
pointwise style, however, will be dealt with loosely.

The language Fun is not closed under converse - we can define non-injective 
functions in Fun, such as fst and snd, whose converses are not functional. In 
other words, Fun is powerful enough that it allows the programmer to define 
functions “unhealthy” under converse. In the next section we will further refine 
Fun into an injective language. 
4 The Injective Language Inv

In the previous section we defined a functional language Fun with a relational 
semantics. All constructs of Fun have relational interpretations and can thus 
be embedded in relations. In this section, we define a functional language Inv 
that allows only injective functions. All its constructs can be embedded in, and 
therefore Inv is strictly a subset of, Fun. 

The problematic constructs in Fun include constant functions, fst, snd, and
the split. Constant functions and projections lose information. The split
duplicates information and, as a result, in inversion we need to take care of
the consistency of previously copied data. We wish to enforce constrained use of
these problematic constructs by introducing more structured constructs in Inv,
in much the same spirit as we enforced constrained use of intersection by
introducing the split in Fun. The language Inv is defined by:

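\[
\begin{array}{rcl}
I & ::= & I^\circ \mid C \\
  & \mid & \mathit{eq}\;P \mid \mathit{dup}\;P \mid \mathit{neq}\;S\;S \\
  & \mid & I; I \mid \mathit{id} \\
  & \mid & (I \times I) \mid \mathit{assocr} \mid \mathit{swap} \\
  & \mid & (I \cup I) \\
  & \mid & \mu(X : I_X) \\
C & ::= & \mathit{succ} \mid \mathit{cons} \\
P & ::= & \mathit{nil} \mid \mathit{zero} \mid S \\
S & ::= & C^\circ \mid \mathit{fst} \mid \mathit{snd} \mid \mathit{id} \mid S; S
\end{array}
\]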

Each construct in Inv has its inverse in Inv. Constructors cons and succ have
inverses $\mathit{cons}^\circ$ and $\mathit{succ}^\circ$. The function swap, now a
primitive, is its own inverse. That is, $\mathit{swap}^\circ = \mathit{swap}$. The
function assocr has inverse assocl, whose definition will be given later. The
inverse operator promotes into composition, product, union and fixed-point
operator by the following rules:

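$(f; g)^\circ = g^\circ; f^\circ$

$(f \times g)^\circ = (f^\circ \times g^\circ)$

$(f \cup g)^\circ = f^\circ \cup g^\circ$

$(\mu F)^\circ = \mu({}^\circ; F; {}^\circ)$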

In the last equation $F$ is a function from Inv expressions to Inv expressions,
and the composition $;$ is lifted. One might instead write
$\mu(\lambda X \bullet (F\,X^\circ)^\circ)$ as the right-hand side. An extra restriction
needs to be imposed on union. To preserve reversibility, in $f \cup g$ we require
not only the domains, but also the ranges of $f$ and $g$, to be disjoint. The
disjointness may be checked by a type system, but we have not explored this
possibility.

The most interesting is the dup/eq pair of operators. Each of them takes an
extra functional argument which is either id, a constant function, or a sequence
of compositions of fsts, snds or constructors. They have types:

$\mathit{dup} :: (F\,a \to a) \to F\,a \to (F\,a \times a)$
$\mathit{eq} :: (F\,a \to a) \to (F\,a \times a) \to F\,a$

where $F$ is some type functor. A call $\mathit{eq}\;f\;(x, a)$ tests whether the
field in $x$ selected by $f$ equals $a$. Conversely, $\mathit{dup}\;f\;x$ copies the
selected field in $x$. They can be understood informally as

$\mathit{dup}\;f\;x = (x, f\,x)$
$\mathit{eq}\;f\;(x, a) = x$, provided $f\,x = a$

That they are inverses of each other can be seen from their relational
definitions:

$\mathit{dup}\;f = \mathit{fst}^\circ \cap f; \mathit{snd}^\circ$
$\mathit{eq}\;f = \mathit{fst} \cap \mathit{snd}; f^\circ$

The definition of $\mathit{dup}\;f$ is similar to that in Section 2 apart from the
presence of the argument $f$. Given the definitions it is clear that
$(\mathit{eq}\;f)^\circ = \mathit{dup}\;f$ and vice versa.

According to the syntax, constant functions zero and nil can appear only as 
arguments to dup. For example, to introduce a fresh zero one has to call dup zero, 
which takes an input a and returns the pair (a,0). Therefore it is guaranteed 
that the input is not lost. An alternative design is to restrict the domain of 
zero and nil to the type Unit (therefore they do not lose information), while 
introducing a variation of the dup construct that creates fresh Unit values only. 
Considering their relational semantics, the two designs are interchangeable. For 
this paper we will mention only the first approach. 
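
The informal reading of dup and eq, including the use of dup with a constant
function, can be sketched as follows. The code is an illustration only: partial
functions are modelled as Python functions raising a hypothetical `Undefined`
exception, and the helper `is_defined` is introduced here purely for testing.

```python
class Undefined(Exception):
    """Raised when a partial function is applied outside its domain."""

def dup(f):
    """dup f x = (x, f x): copy the field of x selected by f."""
    return lambda x: (x, f(x))

def eq(f):
    """eq f (x, a) = x, defined only when f x equals a."""
    def run(p):
        x, a = p
        if f(x) == a:
            return x
        raise Undefined
    return run

fst = lambda p: p[0]
zero = lambda _: 0   # the constant function zero

def is_defined(f, x):
    """True when the partial function f is defined on x."""
    try:
        f(x)
        return True
    except Undefined:
        return False
```

For instance, `dup(zero)` pairs its input with a fresh 0 and `eq(zero)` removes
it again, so the input itself is never lost; `eq(fst)` succeeds only on pairs
whose second component repeats the selected field.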

Some more words on the design decision of the dup/eq operators. Certainly
the extra argument, if restricted to fst, snd and constructors, is only
syntactic sugar. We can always swap outside the element to be duplicated and
use, for example, $(\mathit{id} \times \mathit{dup}\;\mathit{id})$. Nevertheless we
find it quite convenient to have this argument. The natural extension to include
constant functions unifies the two problematic elements for inversion,
duplication and the constant function, into one language construct. For our
future application to bi-directional editing, we further allow the argument to
be any, possibly non-injective, function (such as sum, map fst, etc.), which
turns out to be useful for specifying transformations from source to view.
Allowing dup/eq to take two arguments, however, does not seem to be necessary,
as eq is meant to be asymmetrical - after a successful equality check, one of
the checked values has to go away. This is in contrast to the neq operator to be
introduced below.

The neq p₁ p₂ operator, where p₁ and p₂ are projections defined by fst and
snd, is a partial function checking for inequality. It is defined by

neq p₁ p₂ (x, y) = (x, y)   if p₁ x ≠ p₂ y

Otherwise (x, y) is not in its domain. The neq p₁ p₂ operator is its own inverse.
It is sometimes necessary for ensuring the disjointness of the two branches of a
union.
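The three checking operators can be read as partial functions. The following is a minimal Haskell sketch, assuming partiality is modelled with Maybe (an encoding chosen for this illustration, not the paper's relational semantics); the names dup, eq and neq mirror the operators in the text.

```haskell
-- Partial functions are modelled as functions into Maybe (an assumption of this sketch).

-- dup f a = (a, f a): keep the input and pair it with a computed view of it.
dup :: (a -> b) -> a -> Maybe (a, b)
dup f a = Just (a, f a)

-- eq f (a, b) = a, defined only when f a equals b; this is the converse of dup f.
eq :: Eq b => (a -> b) -> (a, b) -> Maybe a
eq f (a, b)
  | f a == b  = Just a
  | otherwise = Nothing

-- neq p1 p2 (x, y) = (x, y), defined only when p1 x differs from p2 y.
neq :: Eq c => (a -> c) -> (b -> c) -> (a, b) -> Maybe (a, b)
neq p1 p2 (x, y)
  | p1 x /= p2 y = Just (x, y)
  | otherwise    = Nothing
```

Running dup f and then eq f returns the original input, while neq either passes its argument through unchanged or is undefined, which is why it is its own inverse.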

Some more operators will be introduced in sections to come to deal with the
sum type, trees, etc.³ For now, these basic operators are enough for our purpose.

³ Of course, Fun can be extended in the same way, so these new operators can still be
embedded in Fun.

Apparently every program in Inv is invertible, since no information is lost in any
operation. Every operation has its inverse in Inv. The question, then, is: can we
actually define useful functions in this quite restrictive-looking language?



5 Examples of Injective Functions in Inv 

In this section we give some examples of injective functions expressed in Inv. 



5.1 Piping Functions 

We loosely define “piping functions” as functions that move around objects in
pairs, copy them, or discard them, without checking their values, that is, “natural”
functions on pairs. We choose not to include assocl as a primitive because
it can be defined in terms of other primitives, in several ways in fact. One is
as below:

assocl = swap; (swap x id); assocr; (id x swap); swap

Another is:

assocl = swap; assocr; swap; assocr; swap

Alternatively, one might wish to make assocl a primitive, so that inverting assocr 
does not increase the size of a program. 

These piping functions will turn out to be useful:

subr (a, (b, c)) = (b, (a, c))
trans ((a, b), (c, d)) = ((a, c), (b, d))
distr (a, (b, c)) = ((a, b), (a, c))

The function subr substitutes the first component in a pair to the right, trans
transposes a pair of pairs, while distr distributes a value into a pair. They have
point-free definitions in Inv, shown below⁴:

subr = assocl; (swap x id); assocr
trans = assocr; (id x subr); assocl
distr = (dup id x id); trans

From the definitions it is immediate that subr and trans are their own inverses.
The function distr, on the other hand, makes use of dup id to duplicate a before
distribution. Its inverse, trans; (eq id x id), thus has to perform an equality check
before joining the two a's into one.

In Section 6.1 we will talk about automatic construction of piping functions. 
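The point-free definitions of subr, trans and distr can be transcribed almost literally into Haskell. This is only a sketch: it assumes that `>>>` and `***` from Control.Arrow stand in for the paper's `;` and `x`, and the name undistr for the inverse of distr is introduced here for illustration.

```haskell
import Control.Arrow ((>>>), (***))

swap :: (a, b) -> (b, a)
swap (a, b) = (b, a)

assocr :: ((a, b), c) -> (a, (b, c))
assocr ((a, b), c) = (a, (b, c))

assocl :: (a, (b, c)) -> ((a, b), c)
assocl (a, (b, c)) = ((a, b), c)

-- subr = assocl; (swap x id); assocr
subr :: (a, (b, c)) -> (b, (a, c))
subr = assocl >>> (swap *** id) >>> assocr

-- trans = assocr; (id x subr); assocl
trans :: ((a, b), (c, d)) -> ((a, c), (b, d))
trans = assocr >>> (id *** subr) >>> assocl

-- distr = (dup id x id); trans
distr :: (a, (b, c)) -> ((a, b), (a, c))
distr = ((\a -> (a, a)) *** id) >>> trans

-- Inverse of distr: trans, followed by an equality check on the two copies of a.
-- The check can fail, so the result is wrapped in Maybe.
undistr :: Eq a => ((a, b), (a, c)) -> Maybe (a, (b, c))
undistr p = case trans p of
  ((a, a'), bc)
    | a == a'   -> Just (a, bc)
    | otherwise -> Nothing
```

Applying subr or trans twice gives back the input, and undistr succeeds exactly on pairs whose first components agree.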

⁴ Another definition, subr = swap; assocr; (id x swap), is shorter if assocl is not a
primitive.



5.2 Patterns

Patterns in Fun are written in terms of products, nil°, and cons°. For example,
(cons° x id) decomposes the first component of a pair, while (nil° x id) checks
whether that component is the empty list. The latter is usually followed by a
snd function to throw away the unit resulting from nil°.

The term (cons° x id) is still a legal and useful pattern in Inv. However, we
do not have nil° in Inv. Part of the reason is that we do not want to have to
introduce snd into the language, which allows the programmer to throw arbitrary
information away. Instead, to match the pattern (x, [ ]), we write eq nil, which
throws [ ] away and keeps x. It is (id x nil°); fst packaged into one function. On
the other hand, its inverse dup nil introduces a fresh empty list.

Similarly, swap; eq nil is equivalent to (nil° x id); snd. We define

nl = swap; eq nil

because we will use it later.



5.3 Snoc, or Tail-Cons 

The function wrap :: A -> [A], wrapping the input into a singleton list, can be
defined in Fun as wrap = ⟨id, nil⟩; cons. It also has a definition in Inv:

wrap = dup nil; cons

Its converse is therefore wrap° = cons°; eq nil: the input list is deconstructed
and a nullity test is performed on the tail.

The function snoc :: ([A], A) -> [A] appends an element to the right end of a
list. With wrap, we can define snoc in Fun as

snoc = μ(X: (nil° x id); snd; wrap U
            (cons° x id); assocr; (id x X); cons)

To define it in Inv, we notice that (nil° x id); snd is exactly nl defined above.
Therefore we can rewrite snoc as:

snoc = μ(X: nl; wrap U
            (cons° x id); assocr; (id x X); cons)

Its inverse snoc° :: [A] -> ([A], A) extracts the last element from a list, if the
list is non-empty. The first branch of snoc°, namely wrap°; nl°, extracts the only
element from a singleton list, and pairs it with an empty list. The second branch,
cons°; (id x snoc°); assocl; (cons x id), deconstructs the input list, processes the
tail with snoc°, before assembling the result. The two branches have disjoint
domains because the former takes only singleton lists while the latter, due to
the fact that snoc° takes only non-empty lists, accepts only lists with two or
more elements.



5.4 Mirroring

Consider the function mirror :: [A] -> [A] which takes a list and returns its
mirror, for example, mirror [1, 2, 3] = [1, 2, 3, 3, 2, 1]. Assume that snoc and its
inverse exist. Its definition in Fun is given below:

mirror = μ(X: nil°; nil U
              cons°; ⟨fst, ⟨snd, fst⟩⟩; (id x (X x id); snoc); cons)

To rewrite mirror in Inv, the tricky part is to convert ⟨fst, ⟨snd, fst⟩⟩ into something
equivalent in Inv. It turns out that ⟨fst, ⟨snd, fst⟩⟩ = dup fst; assocr. Also,
since nil° is not available in Inv, we have to rewrite nil°; nil as dup id; eq nil. The
function dup id duplicates the input, before eq nil checks whether the input is
the empty list and eliminates the duplicated copy if the check succeeds. It will
be discussed in Section 6.1 how such conversion can be done automatically. As
a result we get:

mirror = μ(X: dup id; eq nil U
              cons°; dup fst; assocr; (id x (X x id); snoc); cons)

It is educational to look at its inverse. By distributing the converse operator
inside, we get:

mirror° = μ(X: dup nil; eq id U
               cons°; (id x snoc°; (X x id)); assocl; eq fst; cons)

Note that dup fst is inverted to an equality test eq fst. In the second branch,
cons°; (id x snoc°; (X x id)) decomposes the given list into the head, the last
element, and the list in between. A recursive call then processes the list, before
assocl; eq fst checks that the first and the last elements are equal. The whole
expression fails if the check fails, thus mirror° is a partial function. It is desirable
to perform the equality check before making the recursive call. One can show,
via algebraic reasoning, that the second branch equals

cons°; (id x snoc°); assocl; eq fst; (id x X); cons

which performs the check and rejects the illegal list earlier. This is one example
showing that program inversion, even in this compositional style, is more than
“running it backwards”. To construct a program with desirable behaviour it takes
some more transformations, some of which are easier to perform mechanically than
others.
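A pointwise reading of snoc, mirror and their inverses may help here. The Haskell below is a sketch under the assumption that partial functions are modelled with Maybe; the names snocInv and mirrorInv are introduced for this illustration. mirrorInv follows the transformed second branch, so the equality test runs before the recursive call.

```haskell
-- snoc appends one element at the right end of a list.
snoc :: ([a], a) -> [a]
snoc ([], a)    = [a]                -- first branch: nl; wrap
snoc (b : x, a) = b : snoc (x, a)    -- second branch: recurse on the tail

-- snoc°: defined on non-empty lists only.
snocInv :: [a] -> Maybe ([a], a)
snocInv []      = Nothing
snocInv [a]     = Just ([], a)       -- singleton list
snocInv (b : x) = do                 -- two or more elements
  (y, a) <- snocInv x
  Just (b : y, a)

mirror :: [a] -> [a]
mirror []      = []
mirror (a : x) = a : snoc (mirror x, a)

-- mirror°: split off head and last element, compare them first,
-- and only then recurse on the list in between.
mirrorInv :: Eq a => [a] -> Maybe [a]
mirrorInv []      = Just []
mirrorInv (a : y) = do
  (z, b) <- snocInv y
  if a == b then (a :) <$> mirrorInv z else Nothing
```

On a list that is not a mirror image, mirrorInv returns Nothing as soon as an outermost pair of elements disagrees.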



5.5 Labelled Concatenation and Double Concatenation 

List concatenation is not injective. However, the following function lcat (labelled
concatenation)

lcat (a, (x, y)) = (a, x ++ [a] ++ y)

is injective if its domain is restricted to tuples where a does not appear in x. This
way a acts as a marker telling us where to split x ++ y into two. Its point-free
definition can be written as:

lcat = nmem fst; ⟨fst, subr; (id x cons); cat⟩

where nmem p (a, x) = (a, x) if a is not a member of the list p x. To show that
it is injective, it is sufficient to show an alternative definition of lcat in Inv:

lcat = μ(X: (dup id x nl); assocr; (id x cons) U
            (id x (cons° x id); assocr);
            neq id fst; subr; (id x X); subr; (id x cons))

It takes a tedious but routine inductive proof, given in the appendix, to show
that the two definitions are indeed equivalent. To aid understanding, the reader
can translate it to its corresponding pointwise definition:

lcat (a, ([], y)) = (a, a : y)
lcat (a, (b : x, y)) = let (a', xy) = lcat (a, (x, y))
                       in (a', b : xy)    if a ≠ b

The presence of neq is necessary to guarantee the disjointness of the two branches:
the first branch returns a list starting with a, while the second branch returns
a list whose head is not a.
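The pointwise definition of lcat and its inverse can be sketched in Haskell as follows, again assuming Maybe for partiality. The name lcatInv is introduced here; the guards a /= b and a == b play the roles of neq and of the equality test that distinguishes the two branches.

```haskell
-- lcat (a, (x, y)) = (a, x ++ [a] ++ y), defined only when a does not occur in x.
lcat :: Eq a => (a, ([a], [a])) -> Maybe (a, [a])
lcat (a, ([], y)) = Just (a, a : y)
lcat (a, (b : x, y))
  | a /= b    = do
      (a', xy) <- lcat (a, (x, y))
      Just (a', b : xy)
  | otherwise = Nothing

-- lcat°: split the list at the first occurrence of the marker a.
lcatInv :: Eq a => (a, [a]) -> Maybe (a, ([a], [a]))
lcatInv (_, []) = Nothing
lcatInv (a, b : xy)
  | a == b    = Just (a, ([], xy))      -- first branch: the head is the marker
  | otherwise = do                      -- second branch: the head is not the marker
      (a', (x, y)) <- lcatInv (a, xy)
      Just (a', (b : x, y))
```

Because the two branches of lcatInv test the head against a, exactly one of them applies to any non-empty input.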

Similarly, the following function dcat (for “double” concatenation)

dcat (a, ((x, y), (u, v))) = (a, (x ++ y, u ++ [a] ++ v))

is injective if its domain is restricted to tuples where a does not appear in u,
and x and u are equally long. Its point-free definition can be written as:

dcat = pred; distr; ((id x cat) x subr; (id x cons); cat)

where pred is the predicate true of (a, ((x, y), (u, v))) where a is not in u and
length x = length u. To show that it is injective, it is sufficient to show an alternative
definition of dcat in Inv:

dcat = μ(X: (dup id x (nl x nl)); trans; (id x cons); assocr U
            (id x ((cons° x id) x (cons° x id))); neq id (snd; fst);
            pi; (id x X); subr; (id x trans; (cons x cons)))

where pi is a piping function defined by

pi = (id x (assocr x assocr); trans); subr

such that pi (a, (((b, x), y), ((c, u), v))) = ((b, c), (a, ((x, y), (u, v)))). The proof
is similar to the proof for lcat and need not be spelt out in detail. To aid
understanding, the above dcat is actually equivalent to the following pointwise
definition:

dcat (a, (([], y), ([], v))) = (a, (y, a : v))
dcat (a, ((b : x, y), (c : u, v))) = let (a', (xy, uv)) = dcat (a, ((x, y), (u, v)))
                                     in (a', (b : xy, c : uv))



5.6 Printing and Parsing XML Trees

Consider internally labelled binary trees:

data Tree A = Null | Node A (Tree A) (Tree A)

To deal with trees, we extend Inv with a new primitive node :: (A x (Tree A x
Tree A)) -> Tree A, the uncurried variation of Node, and extend P with a new
constant function null. An XML tree is basically a rose tree, that is, a tree where each
node has a list of children. A forest of rose trees, however, can be represented
as a binary tree by the child-sibling representation: the left child of a node
in the binary tree represents the leftmost child of the corresponding XML node,
while the right child represents its next sibling. For example, the XML tree in
Figure 1(a) can be represented as the binary tree in Figure 1(b).



(a)  <a>
       <b> <c></c>
           <d></d>
       </b>
       <e></e>
     </a>

(b)  Node a (Node b (Node c Null
                            (Node d Null Null))
                    (Node e Null Null))
            Null

Fig. 1. Child-sibling representation of an XML tree.



To print a binary tree to its conventional serial representation, on the other
hand, one has to print an opening tag (for example <b>), its left subtree
(<c></c><d></d>), a closing tag (</b>), and then print its right subtree
(<e></e>). That is similar to what lcat does! As a simplification, we define:

serialise :: Tree A -> [A]
serialise = μ(X: dup nil; swap; eq null U
                 node°; (id x (X x X)); lcat; cons)

The function serialise takes a binary tree whose values in each node do not occur
in the right subtree, and flattens the tree into a list. To deal with XML trees
in general, we will have to return a list of opening/closing tags and check, in
lcat, that the tags are balanced. To perform the check, we have to maintain a
stack of tags. For demonstration purposes, we deal with only the simpler case
here. Its inverse, serialise°, parses a stream of labels back to an XML tree. By
implementing printing, we get parsing for free!
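As a rough illustration, serialise and a direct inverse can be written pointwise in Haskell. This sketch makes two assumptions that go beyond the text: serialise is written as a total function that does not check the side condition on labels, and the inverse (called parseNaive here, a name introduced for illustration) is only meaningful for trees whose labels are all distinct. The call to break does the work of lcat°.

```haskell
data Tree a = Null | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

-- Print a node as: label, left subtree, the label again (closing tag), right subtree.
serialise :: Tree a -> [a]
serialise Null         = []
serialise (Node a l r) = a : serialise l ++ [a] ++ serialise r

-- Direct inverse: split the tail at the first re-occurrence of the label
-- and parse both halves recursively. Each split rescans the list.
parseNaive :: Eq a => [a] -> Maybe (Tree a)
parseNaive []      = Just Null
parseNaive (a : x) = case break (== a) x of
  (y, _ : z) -> Node a <$> parseNaive y <*> parseNaive z
  (_, [])    -> Nothing    -- no closing label found
```

For the tree of Figure 1(b), serialise produces the label sequence a b c c d d b e e a, and parseNaive maps that sequence back to the same tree.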

However, serialise° is a quadratic-time parsing algorithm. The reason is that
serialise, due to repeated calls to lcat, is quadratic too, and the inverted program,
without further optimising transformation, always has the same efficiency as the
original one. To construct a linear-time parsing algorithm, we can try to construct
a linear version of serialise. Alternatively, we can make use of well-known
program derivation techniques to construct an algorithm performing serialise°
in linear time. We define the function pparse (partial-parse) as below:

pparse (a, x) = (a, (serialise° y, z))
    where y ++ [a] ++ z = x

In point-free style it is written

pparse = lcat°; (id x (serialise° x id))

To come up with a recursive definition, we notice that the identity function id,
when instantiated to take a value (a, x) of type (A x [A]), can be factored into
id = (id x dup nil; eq id) U (id x cons°; cons), corresponding to the cases of x being
empty or non-empty respectively. We start the derivation with pparse = id; pparse
and deal with each case separately. It will turn out that we need to further split
the second case into two:

id = (id x dup nil; eq id) U
     (id x cons°); swap; eq fst; dup fst; (cons x id); swap U
     (id x cons°; cons); neq id (cons°; fst)

When x is non-empty, the head of x may equal a or not. We prefix pparse with
each of the cases, and try to simplify it using algebraic rules. It is basically a
case analysis in point-free style. Some of the branches may turn out to yield an
empty relation.

We will demonstrate only the third branch. The derivation relies on the
following associativity property:

a : ((b : (x ++ [b] ++ y)) ++ [a] ++ z) = a : (b : x ++ [b] ++ (y ++ [a] ++ z))

The point-free counterpart of the property, however, looks much more cumbersome.
We define:

subrr = subr; (id x subr)
subassoc = (id x assocr; (id x assocr)); subrr

The associativity property can be rewritten as:

(id x (lcat; cons x id)); lcat
  = subassoc; (id x (id x lcat)); subrr°; (id x lcat; cons); neq id (cons°; fst)    (1)

The neq check needs to be there; otherwise only the inclusion ⊆ holds. Given
(1), the derivation of pparse is long but trivial:

   neq id (cons°; fst); pparse
=    {definition of pparse}
   neq id (cons°; fst); lcat°; (id x (serialise° x id))
⊇    {definition of serialise; we try its branches separately}
   neq id (cons°; fst); lcat°;
   (id x (cons°; lcat°; (id x (serialise° x serialise°)); node x id))
=    {products}
   neq id (cons°; fst); lcat°; (id x (cons°; lcat° x id));
   (id x ((id x (serialise° x serialise°)); node x id))
=    {by (1) and neq f g; neq f g = neq f g}
   neq id (cons°; fst); (id x cons°; lcat°); subrr; (id x (id x lcat°)); subassoc°;
   (id x ((id x (serialise° x serialise°)); node x id))
=    {naturality of subassoc}
   neq id (cons°; fst); (id x cons°; lcat°); subrr; (id x
   (serialise° x lcat°; (id x (serialise° x id)))); subassoc°; (id x (node x id))
=    {definition of pparse}
   neq id (cons°; fst); (id x cons°; lcat°); subrr;
   (id x (serialise° x pparse)); subassoc°; (id x (node x id))
=    {naturality of subrr}
   neq id (cons°; fst); (id x cons°; lcat°; (id x (serialise° x id))); subrr;
   (id x (id x pparse)); subassoc°; (id x (node x id))
=    {definition of pparse}
   neq id (cons°; fst); (id x cons°; pparse); subrr;
   (id x (id x pparse)); subassoc°; (id x (node x id))
=    {since neq f (cons°; g); (id x cons°) = (id x cons°); neq f g}
   (id x cons°); neq id fst; (id x pparse); subrr;
   (id x (id x pparse)); subassoc°; (id x (node x id))

After some derivation, and a careful check that the recursive equation does
yield a unique fixed-point (see [9]), one will come up with the following definition
of pparse:

pparse = μ(X: (id x cons°);
              (swap; eq fst; (id x dup null; swap) U
               neq id fst; (id x X); subrr; (id x (id x X)); subassoc°;
               (id x (node x id))))

Now that we have pparse, we need to express serialise° in terms of pparse.
Some derivation would show that:

serialise° = μ(X: dup null; swap; eq nil U
                  cons°; pparse; (id x (id x X)); node)

The point-free definitions of pparse and serialise° might be rather confusing to
the reader. To aid understanding, their pointwise definitions are given in Figure 2.
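The pointwise definitions of Figure 2 transcribe almost directly into Haskell. The sketch below assumes Maybe for partiality and uses the name parse for serialise° (both are choices made for this illustration); it also assumes that all labels in the tree are distinct. Every input element is inspected once, which is where the linear running time comes from.

```haskell
data Tree a = Null | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

-- pparse (a, x) parses the prefix of x up to the closing label a and
-- returns the tree for that prefix together with the unconsumed rest.
pparse :: Eq a => (a, [a]) -> Maybe (a, (Tree a, [a]))
pparse (_, []) = Nothing
pparse (a, b : x)
  | a == b    = Just (a, (Null, x))      -- closing label reached
  | otherwise = do
      (_, (t, y)) <- pparse (b, x)       -- subtree opened by b
      (_, (u, z)) <- pparse (a, y)       -- siblings of b, up to the closing a
      Just (a, (Node b t u, z))

-- serialise° expressed through pparse.
parse :: Eq a => [a] -> Maybe (Tree a)
parse []      = Just Null
parse (a : x) = do
  (_, (t, y)) <- pparse (a, x)
  u <- parse y
  Just (Node a t u)
```

Unlike an inverse that searches for each closing label separately, pparse threads the remaining input through the recursion, so no part of the list is scanned twice.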



pparse (a, a : x) = (a, (Null, x))
pparse (a, b : x) =
    let (b, (t, y)) = pparse (b, x)
        (a, (u, z)) = pparse (a, y)
    in (a, (Node b t u, z))

serialise° [] = Null
serialise° (a : x) =
    let (a, (t, y)) = pparse (a, x)
        u = serialise° y
    in Node a t u

Fig. 2. Pointwise definition of pparse and serialise°.



5.7 Loops

An important feature is still missing in Inv: we cannot define loops. Loops
come in handy when we want to show that Inv is computationally as powerful as
Bennett’s reversible Turing machine [4], since the simulation of a Turing machine
is best described as a loop. In Fun, one can write a loop as a tail-recursive function
μ(X: term U body; X) where term and body have disjoint domains. However, the
range of body; X contains that of term, which is not allowed in Inv: when we
run the loop backwards we do not know whether to terminate the loop now or
execute the body again.

Tail recursion is allowed in [13], where they resolve the non-determinism
in a way similar to how left-recursive grammars are dealt with in LR parsing.
Alternatively, we could introduce a special construct for loops, for example,
S; B*; T, where the initialisation S and loop body B have disjoint ranges, while
B and the terminating condition T have disjoint domains. In [8, 9], the conditions
for a loop to terminate, as well as guidelines for designing terminating loops, were
discussed in a similar style. One of the earliest case studies of inverting loops is
[7]. Construction and reasoning of invertible loops in general have been discussed
in [1].

Luckily, just to show that Inv is computationally equivalent to the reversible
Turing machine, we do not need loops. One can code the reversible Turing machine
as a function which returns the final state of the tapes together with an
integer counting the number of iterations executed. The count can then be eliminated
in a clever way described by Bennett. More discussions will be given in
Section 6.2.
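The idea of returning an iteration count can be illustrated with a small Haskell sketch. It is not a Turing machine simulation: runCounted and unrunCounted are hypothetical names, the step function reports termination by returning Nothing, and an inverse of the loop body is assumed to be given.

```haskell
-- Run a step function until it reports termination, counting the iterations.
-- The count is the extra output that makes the loop invertible.
runCounted :: (s -> Maybe s) -> s -> (s, Int)
runCounted step = go 0
  where
    go n s = case step s of
      Nothing -> (s, n)
      Just s' -> go (n + 1) s'

-- Undo exactly n iterations, given an inverse of the loop body.
-- With the count available there is no doubt about when to stop.
unrunCounted :: (s -> s) -> (s, Int) -> s
unrunCounted unstep (s, n) = iterate unstep s !! n
```

Running backwards needs no termination test at all, which sidesteps the problem that the range of the loop body overlaps the range of the terminating branch.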



6 Translating Non-injective Functions 

Still, there are lots of things we cannot do in Inv. We cannot add two numbers,
we cannot concatenate two lists. In short, we cannot construct non-injective
functions. However, given a non-injective function p :: A -> B in Fun, we can
always construct a pᵢ :: A -> (B x H) in Inv such that pᵢ; fst = p. In other
words, p a = b if and only if there exists some h satisfying pᵢ a = (b, h).

Such a pᵢ may not be unique, but always exists: you can always take H = A
and simply copy the input to the output. However, it is not immediately obvious
how to construct such a pᵢ :: A -> (B x A) in Inv. Note that simply calling dup will
not do, since not every function can be an argument to dup. Nor is it immediately
obvious how to compose two transformed functions without throwing away the
intermediate result. In Section 6.2, we will discuss the construction in more
detail.

As another alternative, in Section 6.1 we will introduce what we call the
“logging” translation, where a history of execution is recorded in H. The H
resulting from the logging translation might not be the most interesting one,
however. We can actually make pᵢ do different things by designing different H.
We will see such an example in Section 6.3.

6.1 The Logging Translation 

In this section we describe the logging translation from Fun functions to Inv. It 
basically works by pairing the result of the computation together with a history, 
where each choice of branch and each disposed piece of data is recorded, so 
one can always trace the computation back. It is similar to a translation for 
procedural languages described in [28] . 



Logging the History. To represent the history, we introduce several new operators:
unit, a constant function like zero and nil, introduces the unit value ().
Functions inl :: A -> A + B and inr :: B -> A + B wrap a value into a sum
type. Finally, in builds recursive types.



log :: Fun -> (Inv, Bool)

log succ = (succ, F)

log (f U g) = (h (log f); (id x inl) U
               h (log g); (id x inr), T)
  where h (f, F) = f; dup unit
        h (f, T) = f

log (f x g) = case (log f, log g) of
    ((f', F), (g', F)) -> ((f' x g'), F)
    ((f', T), (g', F)) -> ((f' x g'); lsub, T)
    ((f', F), (g', T)) -> ((f' x g'); assocl, T)
    ((f', T), (g', T)) -> ((f' x g'); trans, T)
  where lsub = assocr; (id x swap); assocl

log μF = (μ(X: fst (log (F (S X))); (id x in)), T)
log (S f) = (f, T)
    -- a placeholder for recursion

log fs | unsafe    = compose (pipe hd) (log tl)
       | otherwise = compose (log hd) (log tl)
    -- fs may be a split or a seq. of composition
  where (hd, unsafe, tl) = factor fs

compose (f', F) (g', F) = ((f'; g'), F)
compose (f', T) (g', F) = (f'; (g' x id), T)
compose (f', F) (g', T) = (f'; g', T)
compose (f', T) (g', T) = (f'; (g' x id); assocr, T)

Fig. 3. The logging translation.



The interesting fragments of the logging translation are summarised in Figure 3.
The function log translates a Fun function into Inv, while returning a
boolean value indicating whether it carries history or not. Using the associativity
of composition and the following “split absorption” rule,

⟨f; h, g; k⟩ = ⟨f, g⟩; (h x k)

we can factor a sequence of composition into a head and a tail. The head segment
uses only splits, fst, snd, converses, and constant functions. The tail segment,
on the other hand, does not start with any of these constructs. For example,
⟨cons, fst⟩ is factored into ⟨id, fst⟩; (cons x id), while nil°; nil into (nil°; nil); id.
The factoring is done by the function factor. If the head does use one of the
unsafe functions, it is compiled into Inv using the method to be discussed in the
next section, implemented in the function pipe.



An Injective Language for Reversible Computation 305

An expression f; g, where f and g have been translated separately, is translated
into f; (g x id); assocr if both f and g carry history. If f carries history
but g does not, it is translated into f; (g x id). A product (f x g), where f and
g both carry history, is translated into (f x g); trans, where trans couples the
histories together. If, for example, only g carries history, it is translated into
(f x g); assocl.

A union f U g is translated into f; (id x inl) U g; (id x inr), where inl and inr
guarantee that the ranges of the two branches are disjoint. We require both f
and g to carry history. The branch not carrying history is postfixed with dup unit
to create an empty history. Finally, the fixed-point μ(X: F(X)) is translated to
μ(X: F(X); (id x in)). Here we enforce that recursive functions always carry
history. Known injective functions can be dealt with separately as primitives.

As an example, let us recall the function cat concatenating two lists. It is
defined very similarly to snoc. The difference is that the two branches no longer
have disjoint ranges.

cat = μ(X: (nil° x id); snd U
           (cons° x id); assocr; (id x X); cons)

The logging translation converts cat into

catᵢ = μ(X: (nl; dup unit; (id x inl) U
             (cons° x id); assocr; (id x X); assocl; (cons x inr));
            (id x in))

The function pipe, to be discussed in the next section, compiles the expression (nil° x
id); snd into swap; eq nil. The first branch, however, does not carry history since
no non-constant data is thrown away. We therefore make a call to dup unit to
create an initial history. In the second branch, the recursive call is assumed to
carry history. We therefore shift the history to the right position by assocl. The
two branches are distinguished by inl and inr.

The history returned by catᵢ is a sequence of inrs followed by inl: an
encoding of natural numbers! It is the length of the first argument. In general,
the history would be a tree reflecting the structure of the recursion.
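The effect of the translation on cat can be mimicked in Haskell. In this sketch the history is collapsed to an Int that counts the inr steps, which is an assumption made for brevity (the translation itself builds a value of a recursive sum type); catLog and uncatLog are names introduced here.

```haskell
-- cat with a logged history: one step is recorded for every element taken
-- from the first list, so the history ends up being its length.
catLog :: ([a], [a]) -> ([a], Int)
catLog ([], y)    = (y, 0)                                          -- inl branch
catLog (a : x, y) = let (z, h) = catLog (x, y) in (a : z, h + 1)    -- inr branch

-- The inverse reads the history to decide which branch produced the result.
uncatLog :: ([a], Int) -> Maybe ([a], [a])
uncatLog (z, 0) = Just ([], z)
uncatLog (a : z, h) | h > 0 = do
  (x, y) <- uncatLog (z, h - 1)
  Just (a : x, y)
uncatLog _ = Nothing
```

Dropping the second component of catLog gives ordinary concatenation, while keeping it is enough to recover both arguments.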



Compiling Piping Functions. Given a “natural” function defined in Fun in 
terms of splits, fst, snd, and constant functions, how does one find its equivalent, 
if any, in Inv? For mechanical construction, simple brute-force searching turned 
out to be satisfactory enough. 

Take, for example, the expression ⟨⟨zero, snd; fst⟩, ⟨fst, snd; snd⟩⟩. By a type
inference on the expression, taking zero as a type of its own, we find that it
transforms input of the form (A, (B, C)) to output ((0, B), (A, C)). We can then
start a breadth-first search, where the root is the input type (A, (B, C)), the
edges are all applicable Inv expressions formed by primitive operations and products, and
the goal is the target type ((0, B), (A, C)). To reduce the search space, we keep
a count of copies of data and constants to be created. In this example, we need
to create a fresh zero, so dup zero needs to be called once (and only once). Since
no input data is duplicated, other calls to dup are not necessary. One possible
result returned by the search is swap; assocr; (dup zero x id); (swap x swap).

As another example, nil°; nil, taking empty lists to empty lists, can be compiled
into either dup nil; eq id or dup id; eq nil. Some extra care is needed to
distinguish input (whose domain is to be restricted) and generated constants,
such that id and dup id; eq id are not legitimate answers. The search space is
finite, therefore the search is bound to terminate.
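A heavily simplified version of such a search can be sketched in Haskell. The sketch is not the authors' pipe function: it assumes that only swap, assocr, assocl and products with id are available, so dup, eq and constants are left out, and the names Shape, steps and search are hypothetical. Shapes stand for the inferred input and output types.

```haskell
-- Shapes of nested pairs; leaves are named type variables.
data Shape = V Char | P Shape Shape
  deriving (Eq, Show)

-- All single steps applicable to a shape, each with the name of the Inv term used.
steps :: Shape -> [(String, Shape)]
steps s = atRoot ++ inside
  where
    atRoot = [("swap",   P b a)       | P a b       <- [s]]
          ++ [("assocr", P a (P b c)) | P (P a b) c <- [s]]
          ++ [("assocl", P (P a b) c) | P a (P b c) <- [s]]
    inside = case s of
      P a b -> [("(" ++ n ++ " x id)", P a' b) | (n, a') <- steps a]
            ++ [("(id x " ++ n ++ ")", P a b') | (n, b') <- steps b]
      V _   -> []

-- Breadth-first search from the input shape to the target shape.
-- The list of visited shapes keeps the search finite.
search :: Shape -> Shape -> Maybe [String]
search start goal = go [start] [(start, [])]
  where
    go _ [] = Nothing
    go seen ((s, path) : queue)
      | s == goal = Just (reverse path)
      | otherwise = go seen' (queue ++ new)
      where
        new   = [(s', n : path) | (n, s') <- steps s, s' `notElem` seen]
        seen' = map fst new ++ seen
```

Searching from the shape (A, (B, C)) to (B, (A, C)) yields a three-step pipeline for subr; which of the equally short pipelines comes out depends on the order in which steps are enumerated.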

When the Fun expression throws away some data, we compile it into an Inv
expression that returns a pair whose first component is the output of the
Fun expression. The forgotten bits are stored in the second component. For
example, the Fun expression ⟨fst, snd; fst⟩ will be compiled into an Inv expression
taking (A, (B, C)) to ((A, B), C), where C is the bit of data that is left out in
the Fun expression. One possibility is simply assocl. The left components of the
compiled Inv expressions constitute the “history” of the computation.



6.2 Relationship with the Reversible Turing Machine 

The logging translation constructs, from a function p :: A -> B, a function
pᵢ :: A -> (B x H), where H records the history of computation. The question,
then, is what to do with the history? Throwing H away merely delays the loss
of information and dissipation of heat. The answer was given by Bennett in [4].

The basic configuration of Bennett’s reversible Turing machine uses three 
tapes: one for input, one for output, and one for the history. Given a two-tape 
Turing machine accepting input A on the input tape and outputting B on the 
output tape, Bennett showed that one can always construct a three-tape re- 
versible Turing machine which reads the input A, and terminates with A and B 
on the input and output tapes, while leaving the history tape blank. This is how 
it is done: in the first phase, the program is run forward, consuming the data 
on the input tape while writing to the output and history tapes. The output is 
then copied. In the third phase the program is run backwards, this time con- 
suming the original output and history, while regenerating the input. This can 
be expressed in Inv by: 

pᵢ; dup fst; (pᵢ° x id) :: A -> (A x B)

We cannot entirely get rid of A, or some other sufficient information, if the
computation is not injective. Otherwise we are losing information. When the
computed function is injective, however, there is a way to leave both the history
and input tapes empty. Bennett’s method to do it can be expressed in our
notation as follows. Assume that there exists a q :: B -> A, defined in Fun,
serving as the inverse of p. The logging translation thus yields qᵢ :: B -> (A x H')
in Inv.



pᵢ; dup fst; (pᵢ° x id); swap; (qᵢ x id); eq fst; qᵢ°







The prefix pᵢ; dup fst; (pᵢ° x id), given input a, computes (a, b). The pair is
swapped and b is passed through qᵢ, yielding ((a, h'), a). The duplicated a is
removed by eq fst, and finally qᵢ° takes the remaining (a, h') and produces b.
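Both constructions can be paraphrased in Haskell. The sketch assumes that each logged function is supplied together with a function computing its inverse (in Inv the inverse comes for free; here it is passed explicitly), and the names bennett, bennettInj, addLog and unaddLog are hypothetical.

```haskell
-- Phase 1 runs the logged function, phase 2 copies the output,
-- phase 3 runs the logged function backwards to clear the history.
bennett :: (a -> (b, h)) -> ((b, h) -> a) -> a -> (a, b)
bennett fwd bwd a =
  let (b, h) = fwd a
      copy   = b
  in (bwd (b, h), copy)

-- For an injective p with inverse q the input can be cleared as well:
-- recompute a from b via q, check it against the retained a, then undo q.
bennettInj :: Eq a
           => (a -> (b, h)) -> ((b, h) -> a)    -- logged p and its inverse
           -> (b -> (a, k)) -> ((a, k) -> b)    -- logged q and its inverse
           -> a -> Maybe b
bennettInj pFwd pBwd qFwd qBwd a0 =
  let (a, b)  = bennett pFwd pBwd a0
      (a', k) = qFwd b
  in if a' == a then Just (qBwd (a', k)) else Nothing

-- A hypothetical logged function used only for illustration:
-- addition, with the second summand kept as history.
addLog :: (Int, Int) -> (Int, Int)
addLog (x, y) = (x + y, y)

unaddLog :: (Int, Int) -> (Int, Int)
unaddLog (s, y) = (s - y, y)
```

For example, bennett addLog unaddLog (3, 4) returns the input (3, 4) paired with the sum 7, with no history left over.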

The above discussion is relevant to us for another reason: it helps to show
that Inv, even without an explicit looping construct, is computationally at least
as powerful as the reversible Turing machine. Let p be a Fun function, defined
tail-recursively, simulating a reversible Turing machine. The types A and B both
represent states of the machine and the contents of the three tapes. The function
q simulates the reversed Turing machine. They can be translated, via the logging
translation, into Inv as functions that return the final state together with an
extra counter. The counter can then be eliminated using the above technique.

6.3 Preorder Traversal 

To see the effect of the logging translation, and its alternatives, let us look at
another example. Preorder traversal for binary trees can be defined in Fun as:

pre = μ(X: null°; nil U
           node°; (id x (X x X); cat); cons)

The logging translation delivers the following program in Inv:

preᵢ = μ(X: (dup nil; eq null; dup unit; (id x inl) U
             node°; (id x (X x X); trans; (catᵢ x id); assocr); (cons x inr));
            (id x in))

which returns a tree as a history. In each node of the tree is a number, returned
by catᵢ, telling us where to split the list into two.

However, if we choose not to rely on the logging translation, we could have
chosen H = [A] and defined:

prein = μ(X: dup nil; eq null; dup nil U
             node°; (id x (X x X); trans); dcat; assocl; (cons x id))

where dcat is as defined in Section 5.5. We recite its definition here:

dcat (a, ((x, y), (u, v))) = (a, (x ++ y, u ++ [a] ++ v))

where a does not occur in u, and x and u have the same length. Since (id x
cat); cons = dcat; assocl; (cons x id); fst, it is obvious that prein; fst reduces to
pre, with a restriction on its domain. The partial function prein accepts only
trees with no duplicated labels. What about prein; snd? It reduces to inorder
traversal of a binary tree. It is a known fact: we can reconstruct a binary tree
having no duplicated labels from its preorder and inorder traversals [23].
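The behaviour of prein and its inverse can be sketched pointwise in Haskell. The names preIn and unPreIn are introduced for this illustration, preIn is written as a total function that does not check for duplicated labels, and break and splitAt together play the role of dcat°: the position of the root label in the inorder list tells us how long the left subtree is.

```haskell
data Tree a = Null | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

-- Preorder and inorder traversals computed together.
preIn :: Tree a -> ([a], [a])
preIn Null         = ([], [])
preIn (Node a l r) = (a : pl ++ pr, il ++ [a] ++ ir)
  where
    (pl, il) = preIn l
    (pr, ir) = preIn r

-- Rebuild the tree, assuming that no label occurs twice.
unPreIn :: Eq a => ([a], [a]) -> Maybe (Tree a)
unPreIn ([], []) = Just Null
unPreIn (a : ps, is) = do
  let (il, rest) = break (== a) is       -- inorder of the left subtree
  ir <- case rest of
    _ : ir' -> Just ir'                  -- inorder of the right subtree
    []      -> Nothing                   -- root label missing from the inorder list
  let (pl, pr) = splitAt (length il) ps  -- split the preorder at the same length
  l <- unPreIn (pl, il)
  r <- unPreIn (pr, ir)
  Just (Node a l r)
unPreIn _ = Nothing
```

The first component of preIn is the ordinary preorder traversal, and the second component is exactly the extra information needed to invert it.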

7 Implementation 

We have a prototype implementation of the logging translation and a simple,
back-tracking interpreter for Inv, both written in Haskell. The implementation of
the logging translation, though tedious, poses no substantial difficulty. Producing
an efficient, non-backtracking Inv interpreter, however, turned out to be more
tricky than we expected.

Implementing a relational programming language has been discussed, for ex- 
ample, in [15] and [20]. Both considered implementing an executable subset of 
Ruby, a relational language for designing circuits [16]. A functional logic pro- 
gramming language was used for the implementation, which uses backtracking 
to collect all results of a relational program. 

The problem we are dealing with here, however, is a different one. We know
that legitimate programs in Inv are deterministic. But can we implement the
language without the use of backtracking? Can we detect the error when the
domains or ranges of the two branches of a union are not disjoint? Consider
snoc°, with the definitions of wrap and nl expanded:

snoc° = μ(X: cons°; eq nil; dup nil; swap U
             cons°; (id x X); assocl; (cons x id))

Both branches start with cons°, and we cannot immediately decide which branch
we should take.

A natural direction to go is to apply some form of domain analysis. Glück
and Kawabe [13] recently suggested another approach. They observed that the
problem is similar to parsing. A program is like a grammar, where the traces
of a program constitute its language. Determining which branch to take is like
determining which production rule to use to parse the input. In [13] they adopted
the techniques used in LR parsing, such as building item sets and automata,
to construct non-backtracking programs. It is interesting to see whether the
handling of parsing conflicts can be adapted to detect, report and resolve the
non-disjointness of branches.



8 Conclusion and Related Work 

We have presented a language, Inv, in which all definable functions are injective.
It is a functional language with a relational semantics. Through examples, we find
that many useful functions can be defined in Inv. In fact, it is computationally
equivalent to Bennett’s reversible Turing machines. Non-injective functions can
be simulated in Inv via the logging translation, which converts each of them to an
Inv function returning both the result and a history.

A lot of previous work has been devoted to the design of programming
languages and models for reversible computation. To the best of the authors’
knowledge, they include Baker’s PsiLisp [3], Lutz and Derby’s Janus [19], Frank’s R
[10], and Zuliani’s model based on probabilistic guarded-command language [28].
Our work differs from the previous ones in several aspects: we base our model
on a functional language; we do not rely on “hidden” features of the machine
to record the history; and we highlight the importance of program derivation as
well as mechanical inversion. We believe that Inv serves as a clean yet expressive
model for the construction of, and reasoning about, reversible programs.
The motivation to study languages for reversible programs traditionally comes
from the thermodynamic view of computation. We were motivated by yet another
reason: to build bi-directional editors. The source data is transformed
to a view. The user can then edit the view, and the system has to work out
how the source should be updated correspondingly. Meertens [21] and Greenwald,
Moore, et al. [14] independently developed combinators for describing
the source-to-view transformation, yet their results are strikingly similar. Both
are combinator-like, functional languages allowing relatively unrestricted use of
non-injective functions. Transformations are surjective functions/relations, and
view-to-source updating is modelled by a function taking both the old source
and the new view as arguments. Things get complicated when duplication is
involved. We are currently exploring a slightly different approach, based on Inv,
in which the transformation is injective by default, and the use of non-injective
functions is restricted to the dup operator. We hope this will make the model
clearer, so that the difficulties can be better tackled.

Our work is complementary to Glück and Kawabe’s recent work on automatic
program inversion. While their focus was on automatic inversion of programs
and ours on theory and language design, many of their results turned out to be
highly relevant to our work. The stack-based intermediate language defined in
[12] is actually an injective language. They also provided sufficient, though not
necessary, conditions for the range-disjointness of branches. For a more precise
check of disjointness they resort to the LR parsing technique [13]. Their insight
that determining the choice of branches is like LR parsing is the key to building an
efficient implementation of Inv. The authors are interested to see how conflict
handling can help to resolve the disjointness of branches.
A Proof for the Labelled Concatenation

Let Icat be defined by

Icat = nmem fst; (fst, id); (id x subr; (id x cons); cat)

where nmem p (a, x) = (a, x) if a is not a member of the list p x, and Icati be
the fixed-point of IcatF, where

IcatF X = (dup id x nl); assocr; (id x cons) U
          (id x (cons° x id); assocr);
          neq id fst; subr; (id x X); subr; (id x cons)

The aim is to show that Icat is also a fixed-point of IcatF. Starting with the 
more complicated branch of IcatF , we reason 

(id x (cons° x id); assocr); neq id fst; subr; (id x Icat); subr; (id x cons) 

= {definition of Icat} 

(id x (cons° x id); assocr); neq id fst; subr;

(id x nmem fst; (fst, id); (id x subr; (id x cons); cat)); subr; (id x cons)

= {since (id x (id x R)); subr = subr; (id x (id x R))}

(id x (cons° x id); assocr); neq id fst; subr; (id x

nmem fst; (fst, id); (id x subr)); subr; (id x (id x (id x cons); cat); cons)

= {since subr; (id x nmem fst) = nmem (snd; fst); subr}

(id x (cons° x id); assocr); neq id fst; nmem (snd; fst);

subr; (id x (fst, id); (id x subr)); subr; (id x (id x (id x cons); cat); cons)

= {expressing the piping in terms of splits} 

(id x (cons° x id); assocr ); neq id fst; nmem (snd; fst); 

(fst, (snd; fst, (snd; snd; fst, (fst, snd; snd; snd)))); 

(id x (id x (id x cons); cat); cons ) 

= {associativity: (id x cat); cons = assocl; (cons x id); cat,

and naturality: (id x (id x R)); assocl = assocl; (id x R)}
(id x (cons° x id); assocr); neq id fst; nmem (snd; fst);

(fst, (snd; fst, (snd; snd; fst, (fst, snd; snd; snd))));

(id x assocl); (id x (cons x cons); cat)

= {move (id x assocr) rightwards}

(id x (cons° x id)); neq id (fst; fst); nmem (fst; snd); (id x assocr);

(fst, (snd; fst, (snd; snd; fst, (fst, snd; snd; snd))));

(id x assocl); (id x (cons x cons); cat)

= {cancelling assocl and assocr with splits}

(id x (cons° x id)); neq id (fst; fst); nmem (fst; snd);

(fst, (snd; fst, (fst, snd; snd))); (id x (cons x cons); cat)

= {split absorption}

(id x (cons° x id)); neq id (fst; fst); nmem (fst; snd);

(fst, (snd; fst; cons, (fst, snd; snd))); (id x (id x cons); cat)

= {products}

(id x (cons° x id)); neq id (fst; fst); nmem (fst; snd);

(id x (cons x id)); (fst, (snd; fst, (fst, snd; snd))); (id x (id x cons); cat)

= {since neq id (fst; fst); nmem (fst; snd); (id x (cons x id))

= (id x (cons x id)); nmem fst}

(id x (cons°; cons x id)); nmem fst;

(fst, (snd; fst, (fst, snd; snd))); (id x (id x cons); cat)

= {products}

(id x (cons°; cons x id)); nmem fst; (fst, id); (id x subr);

(id x (id x cons); cat)

= {folding Icat}

(id x (cons°; cons x id)); Icat

For the other branch we reason: 

(id x (nil°; nil x id)); Icat

= {definition of Icat and (id x (nil x id)); nmem fst = (id x (nil x id))}

(id x (nil°; nil x id)); (fst, id); (id x subr; (id x cons); cat)

= {since h; (f, g) = (h; f, h; g) for total h, f, g}

(id x (nil° x id)); (fst, (id x (nil x id))); (id x subr; (id x cons); cat)

= {split absorption}

(id x (nil° x id)); (fst, (id x (nil x id)); subr; (id x cons); cat)

= {since (f x (g x h)); subr = subr; (g x (f x h))}

(id x (nil° x id)); (fst, subr; (nil x cons); cat)

= {since (nil x id); cat = snd}

(id x (nil° x id)); (fst, subr; (id x cons); snd)

= {since (id x f); snd = snd; f for total f}

(id x (nil° x id)); (fst, subr; snd; cons)

= {split absorption, backwards}

(id x (nil° x id)); (fst, subr; snd); (id x cons)

= {piping}

(dup id x swap; eq nil); assocr; (id x cons)

Therefore we conclude that 
IcatF Icat 

= {with the reasoning above} 

(id x (cons°; cons x id)); Icat U (id x (nil°; nil x id)); Icat

= {composition distributes into union}

((id x (cons°; cons x id)) U (id x (nil°; nil x id))); Icat

= {since cons°; cons U nil°; nil = id} 

Icat 
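
As an informal cross-check of this result (not a substitute for the proof), the two definitions can be modelled in Haskell and compared on sample inputs. The sketch below rests on our own assumptions: partiality is modelled by Maybe, and the names labelledCat, labelledCatRec and labelledCatInv are ours rather than the paper's.

```haskell
-- Direct definition: nmem fst; (fst, id); (id x subr; (id x cons); cat).
-- Defined only when the label a does not occur in x.
labelledCat :: Eq a => (a, ([a], [a])) -> Maybe (a, [a])
labelledCat (a, (x, y))
  | a `notElem` x = Just (a, x ++ a : y)
  | otherwise     = Nothing

-- Recursive definition: one equation per branch of the functional.
labelledCatRec :: Eq a => (a, ([a], [a])) -> Maybe (a, [a])
labelledCatRec (a, ([], y)) = Just (a, a : y)   -- (dup id x nl); assocr; (id x cons)
labelledCatRec (a, (b : x, y))
  | a /= b    = do                               -- the neq id fst test
      (_, z) <- labelledCatRec (a, (x, y))
      Just (a, b : z)
  | otherwise = Nothing

-- Inverse: because a does not occur in x, the first occurrence of a
-- in the result marks the boundary between x and y.
labelledCatInv :: Eq a => (a, [a]) -> Maybe (a, ([a], [a]))
labelledCatInv (a, z) = case break (== a) z of
  (x, _ : y) -> Just (a, (x, y))
  _          -> Nothing
```

The membership test is what makes the function injective: without it the position of the label in the result would not determine where x ends.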
Abstract. Generic Programming deals with the construction of programs
that can be applied to many different datatypes. This is achieved
by parameterizing the generic programs by the structure of the datatypes 
on which they are to be applied. Programs that can be defined generi- 
cally range from simple map functions through pretty printers to complex 
XML tools. 

The design space of generic programming languages is largely unex- 
plored, partly due to the time and effort required to implement such 
a language. In this paper we show how to write flexible prototype imple- 
mentations of two existing generic programming languages, PolyP and 
Generic Haskell, using Template Haskell, an extension to Haskell that 
enables compile-time meta-programming. In doing this we also gain a 
better understanding of the differences and similarities between the two 
languages. 



1 Introduction 

Generic functional programming [9] aims to ease the burden of the programmer 
by allowing common functions to be defined once and for all, instead of once 
for each datatype. Classic examples are small functions like maps and folds [8], 
but also more complex functions, like parsers and pretty printers [11] and tools 
for editing and compressing XML documents [5], can be defined generically. 
There exist a number of languages for writing generic functional programs [1,3, 
6,7,10,12,13], each of which has its strengths and weaknesses, and researchers 
in generic programming are still searching for The Right Way. Implementing a 
generic programming language is no small task, which makes it cumbersome to 
experiment with new designs. 

In this paper we show how to use Template Haskell [16] to implement two 
generic programming extensions to Haskell: PolyP [10, 15] and Generic Haskell 
[6]. With this approach, generic functions are written in Haskell (with the Template
Haskell extension), so there is no need for an external tool. Furthermore
the support for code generation and manipulation in Template Haskell greatly 
simplifies the compilation of generic functions, thus making the implementations 

* This work is partially funded by the Swedish Foundation for Strategic Research as 
part of the research program “Cover - Combining Verification Methods in Software 
Development”. 
very lightweight and easy to experiment with. The disadvantages of this approach
are that we do not get the nice syntax that a custom-made parser would give us and,
because we have not implemented a type system, generic functions are only
type checked at instantiation time.

The rest of this section gives a brief introduction to Template Haskell, PolyP 
and Generic Haskell (GH). Section 2 compares PolyP and GH. Section 3 intro- 
duces the concepts involved in implementing generic programming using Tem- 
plate Haskell. Sections 4 and 5 outline the implementations of PolyP and GH 
and Section 6 points to possible future work. 

1.1 Template Haskell 

Template Haskell [16] is a language extension implemented in the Glasgow 
Haskell Compiler that enables compile-time meta-programming. This means that 
we can define code generating functions that are run at compile-time. In short,
you can splice abstract syntax into your program using the $(...) notation and
lift an expression to the abstract syntax level using the quasi-quotes [| ... |].
Splices and quasi-quotes can be nested arbitrarily deep. For example, it is possible
to define the printf function with the following type:

printf :: String -> Q Exp

Here printf takes the format string as an argument and produces the abstract 
syntax for the printf function specialized to that particular format string. To 
use this function we can write, for instance 

Main> $(printf "Hello %s, number %d!") "World" 100
"Hello World, number 100!"

Template Haskell comes with libraries for manipulating the abstract syntax of 
Haskell. The result type Q Exp of the printf function models the abstract syntax 
of an expression. The type constructor Q is the quotation monad, that takes care 
of, for instance, fresh name generation and the Exp type is a normal Haskell 
datatype modeling Haskell expressions. Similar types exist for declarations (Dec) 
and types (Type). The quotation monad is built on top of the IO monad, so if we
want to escape it we have to use the function unsafePerformIO :: IO a -> a.
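
The following sketch illustrates these types. It shows a Q Exp obtained by quoting and one built by hand with fresh names, both inspected as ordinary data. It assumes a current GHC with the template-haskell library, where a Q computation that does not use reification can be run in IO with runQ; the names incE, dupE and render are ours.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- Abstract syntax obtained by quoting ordinary Haskell syntax.
incE :: Q Exp
incE = [| \x -> x + (1 :: Int) |]

-- The same kind of value built by hand; newName asks the quotation
-- monad for a fresh variable name.
dupE :: Q Exp
dupE = do
  x <- newName "x"
  lamE [varP x] (tupE [varE x, varE x])

-- Run a Q computation in IO and pretty-print the resulting syntax tree.
render :: Q Exp -> IO String
render q = pprint <$> runQ q
```

In GHCi, render dupE >>= putStrLn prints the generated lambda, which is a convenient way to look at what a code generating function produces before splicing it.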

The definition of printf might look a bit complicated with all the lifts and 
splices, but ignoring those we have precisely what we would have written in an 
untyped language. 

printf :: String -> Q Exp
printf fmt = prAcc fmt [| "" |]
  where
    prAcc :: String -> Q Exp -> Q Exp
    prAcc fmt r = case fmt of
      '%':'d':f -> [| \n -> $(prAcc f [| $r ++ show n |]) |]
      '%':'s':f -> [| \s -> $(prAcc f [| $r ++ s |]) |]
      c:f       -> prAcc f [| $r ++ [c] |]
      ""        -> r
The prAcc function uses an accumulating parameter r containing (the abstract
syntax of) an expression representing the string created so far. Every time we
see a % code we add a lambda at the top level and update the parameter with
the argument.

The keen observer will note that this definition of printf is quadratic in the 
length of the format string. This is easy to fix but for the sake of brevity we 
chose the inefficient version, which is slightly shorter. 

Template Haskell supports program reflection or reification, which means 
that it is possible to get hold of the type of a named function or the declaration 
that defines a particular entity. For example: 

reifyType id    :: Q Type
reifyDecl Maybe :: Q Dec

We can use this feature to find the definitions of the datatypes that a generic 
function is applied to. 

1.2 PolyP 

PolyP [10,15] is a language extension to Haskell for generic programming that
allows generic functions over unary regular datatypes. A regular datatype is a
datatype with no function spaces, no mutual recursion and no nested recursion¹.
Examples of unary regular datatypes are [], Maybe and Rose:

data Rose a = Fork a [Rose a] 

Generic programming in PolyP is based on the notion of pattern functors. 
Each datatype is associated with a pattern functor that describes the structure 
of the datatype. The different pattern functors are shown in Figure 1. The (:+:)
pattern functor is used to model multiple constructors, (:*:) and Empty model
the list of arguments to the constructors, Par is a reference to the parameter
type, Rec is a recursive call, (:@:) models an application of another regular
datatype and Const is a constant type. The pattern functors of the datatypes 
mentioned above are (the comments show the expanded definitions applied to 
two type variables p and r): 

type ListF  = Empty :+: (Par :*: Rec)   -- Either () (p, r)
type MaybeF = Empty :+: Par             -- Either () p
type RoseF  = Par :*: ([] :@: Rec)      -- (p, [r])

PolyP provides two functions inn and out to fold and unfold the top-level 
structure of a datatype. Informally, for any regular datatype D with pattern 
functor F, inn and out have the following types: 

inn :: F a (D a) -> D a
out :: D a -> F a (D a)

Note that only the top-level structure is folded/unfolded. 

1 The recursive calls must have the same form as the left hand side of the definition. 
type (g :+: h) p r = Either (g p r) (h p r)
type (g :*: h) p r = (g p r, h p r)
type Empty p r     = ()
type Par p r       = p
type Rec p r       = r
type (d :@: g) p r = d (g p r)
type Const t p r   = t

Fig. 1. Pattern functors



A special construct, polytypic, is used to define generic functions over pat- 
tern functors by pattern matching on the functor structure. As an example, the 
definition of fmap2, a generic map function over pattern functors, is shown in 
Figure 2. Together with inn and out these polytypic functions can be used to 
define generic functions over regular datatypes. For instance: 

pmap :: (a -> b) -> D a -> D b
pmap f = inn . fmap2 f (pmap f) . out

The same polytypic function can be used to create several different generic func- 
tions. We can, for instance, use fmap2 to define generic cata- and anamorphisms 
(generalized folds and unfolds): 

cata :: (F a b -> b) -> D a -> b
cata f = f . fmap2 id (cata f) . out

ana :: (b -> F a b) -> b -> D a
ana f = inn . fmap2 id (ana f) . f
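
To see what these definitions amount to for one concrete datatype, the following plain Haskell sketch instantiates them by hand for Rose, using the expanded pattern functor (p, [r]). The suffixed names (outRose, fmap2Rose and so on) are ours; in PolyP the single generic definitions above cover every regular datatype.

```haskell
data Rose a = Fork a [Rose a] deriving (Eq, Show)

-- RoseF = Par :*: ([] :@: Rec), expanded to ordinary Haskell types.
type RoseF p r = (p, [r])

-- Unfold and fold the top-level structure only.
outRose :: Rose a -> RoseF a (Rose a)
outRose (Fork a ts) = (a, ts)

innRose :: RoseF a (Rose a) -> Rose a
innRose (a, ts) = Fork a ts

-- fmap2 specialised by hand to RoseF: p acts on the parameter,
-- r on the recursive positions (under the list functor).
fmap2Rose :: (a -> c) -> (b -> d) -> RoseF a b -> RoseF c d
fmap2Rose p r (a, ts) = (p a, map r ts)

pmapRose :: (a -> b) -> Rose a -> Rose b
pmapRose f = innRose . fmap2Rose f (pmapRose f) . outRose

cataRose :: (RoseF a b -> b) -> Rose a -> b
cataRose f = f . fmap2Rose id (cataRose f) . outRose

anaRose :: (b -> RoseF a b) -> b -> Rose a
anaRose f = innRose . fmap2Rose id (anaRose f) . f
```

For example, cataRose (\(a, bs) -> a + sum bs) sums all labels of a tree of numbers.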



1.3 Generic Haskell 

Generic Haskell [2] is an extension to Haskell that allows generic functions over 
datatypes of arbitrary kinds. Hinze [4] observed that the type of a generic func- 
tion depends on the kind of the datatype it is applied to, hence each generic 
function in Generic Haskell comes with a generic (kind indexed) type. The kind 
indexed type associated with the generic map function is defined as follows: 

Map {[ * ]} s t      = s -> t
Map {[ k -> l ]} s t = forall a b. Map {[ k ]} a b ->
                       Map {[ l ]} (s a) (t b)

Generic Haskell uses the funny brackets ({[ ]}) to enclose kind arguments. The
type of the generic map function gmap applied to a type t of kind k can be
expressed as

gmap {| t :: k |} :: Map {[ k ]} t t
polytypic fmap2 :: (a -> c) -> (b -> d) -> f a b -> f c d
  = \p r -> case f of
      g :+: h  -> fmap2 p r -+- fmap2 p r
      g :*: h  -> fmap2 p r -*- fmap2 p r
      Empty    -> const ()
      Par      -> p
      Rec      -> r
      d :@: g  -> pmap (fmap2 p r)
      Const t  -> id

(-+-) :: (a -> c) -> (b -> d) -> Either a b -> Either c d
(f -+- g) (Left x)  = Left (f x)
(f -+- g) (Right y) = Right (g y)

(-*-) :: (a -> c) -> (b -> d) -> (a,b) -> (c,d)
(f -*- g) (x,y) = (f x, g y)

Fig. 2. The definition of fmap2 in PolyP



The second type of funny brackets ({| |}) encloses type arguments. Following
are the types of gmap for some standard datatypes.

gmap {| Bool |}   :: Bool -> Bool
gmap {| [] |}     :: forall a b. (a -> b) -> [a] -> [b]
gmap {| Either |} :: forall a b. (a -> b) ->
                     forall c d. (c -> d) ->
                     Either a c -> Either b d

The kind indexed types follow the same pattern for all generic functions. A
generic function applied to a type of kind κ → ν is a function that takes a
generic function for types of kind κ and produces a generic function for the
target type of kind ν.
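
The pattern can be mimicked in plain Haskell for the first few kinds. The sketch below is only an approximation under our own assumptions: Haskell has no kind indexed types, so we write one synonym per kind by hand (MapStar, MapArr and MapArr2 are our names) and give the corresponding instances of gmap as ordinary functions.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Map {[ * ]} s t
type MapStar s t = s -> t

-- Map {[ * -> * ]} s t
type MapArr s t = forall a b. MapStar a b -> MapStar (s a) (t b)

-- Map {[ * -> * -> * ]} s t
type MapArr2 s t = forall a b. MapStar a b -> MapArr (s a) (t b)

gmapBool :: MapStar Bool Bool
gmapBool = id

gmapList :: MapArr [] []
gmapList = map

-- Takes one map per type argument, as the kind of Either dictates.
gmapEither :: MapArr2 Either Either
gmapEither f g = either (Left . f) (Right . g)
```

Expanding MapArr2 Either Either yields exactly the type given above for gmap applied to Either.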

The generic functions in Generic Haskell are defined by pattern matching on 
the top-level structure of the type argument. Figure 3 shows the definition of the 
generic map function gmap. The structure combinators are similar to those in 
PolyP. Sums and products are encoded by :+: and :*: and the empty product
is called Unit. A difference from PolyP is that constructors and record labels are
represented by the structure combinators Con c and Label l. The arguments
(c and l) contain information such as the name and fixity of the constructor 
or label. A generic function must also contain cases for primitive types such as 
Int. The type of the right hand side of each clause is the type of the generic 
function instantiated with the structure type on the left. The definitions of the 
structure types are shown in Figure 4. Note that the arguments to Con and 
Label containing the name and fixity information are only visible in the pattern 
matching and not in the actual types. 

Generic Haskell contains many features that we do not cover here, such as 
type indexed types, generic abstraction and constructor cases. 
gmap {| t :: k |} :: Map {[ k ]} t t
gmap {| :+: |} gmapA gmapB (Inl a)   = Inl (gmapA a)
gmap {| :+: |} gmapA gmapB (Inr b)   = Inr (gmapB b)
gmap {| :*: |} gmapA gmapB (a :*: b) = gmapA a :*: gmapB b
gmap {| Unit |}                      = id
gmap {| Con c |} gmapA (Con a)       = Con (gmapA a)
gmap {| Label l |} gmapA (Label a)   = Label (gmapA a)
gmap {| Int |}                       = id

Fig. 3. A generic map function in Generic Haskell



data a :+: b = Inl a | Inr b
data a :*: b = a :*: b
data Unit    = Unit
data Con a   = Con a
data Label a = Label a

Fig. 4. Structure types in Generic Haskell
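
As an illustration of how the clauses of Figure 3 combine, the following sketch specialises the generic map to lists by hand in ordinary Haskell. It is a simplification under our own assumptions: the Con and Label wrappers are left out, the conversion functions listToS and listFromS are written manually, and all names other than the structure types are ours.

```haskell
{-# LANGUAGE TypeOperators #-}

infixr 5 :+:
infixr 6 :*:

data a :+: b = Inl a | Inr b
data a :*: b = a :*: b
data Unit    = Unit

-- Top-level structure of lists; the tail stays an ordinary list.
type ListS a = Unit :+: (a :*: [a])

listToS :: [a] -> ListS a
listToS []       = Inl Unit
listToS (x : xs) = Inr (x :*: xs)

listFromS :: ListS a -> [a]
listFromS (Inl Unit)       = []
listFromS (Inr (x :*: xs)) = x : xs

-- The clauses of gmap, one ordinary function per structure combinator.
gmapSum :: (a -> c) -> (b -> d) -> (a :+: b) -> (c :+: d)
gmapSum f _ (Inl a) = Inl (f a)
gmapSum _ g (Inr b) = Inr (g b)

gmapProd :: (a -> c) -> (b -> d) -> (a :*: b) -> (c :*: d)
gmapProd f g (a :*: b) = f a :*: g b

gmapUnit :: Unit -> Unit
gmapUnit = id

-- The specialisation for lists: convert, map over the structure, convert back.
gmapList :: (a -> b) -> [a] -> [b]
gmapList f = listFromS . gmapSum gmapUnit (gmapProd f (gmapList f)) . listToS
```

The recursive occurrence of the list type is handled by the recursive call gmapList f, since the structure type only exposes the top level.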



2 Comparing PolyP and Generic Haskell 

The most notable difference between PolyP and Generic Haskell is the set of 
datatypes available for generic programmers. In PolyP generic functions can only 
be defined over unary regular datatypes, while Generic Haskell allows generic 
functions over (potentially non-regular) datatypes of arbitrary kinds. There is a 
trade-off here, in that more datatypes means fewer generic functions. In PolyP it 
is possible to define generic folds and unfolds such as cata and ana that cannot
be defined in Generic Haskell. 

Even if PolyP and Generic Haskell may seem very different, their approaches 
to generic programming are very similar. In both languages generic functions are 
defined, not over the datatypes themselves, but over a structure type acquired 
by unfolding the top-level structure of the datatype. The structure types in 
PolyP and Generic Haskell are very similar. The differences are that in PolyP 
constructors and labels are not recorded explicitly in the structure type and the 
structure type is parameterized over recursive occurrences of the datatype. This 
is made possible by only allowing regular datatypes. For instance, the structure 
of the list datatype in the two languages is (with Generic Haskell’s sums and 
products translated into Either, (,) and ()): 

type ListF a r = Either () (a, r)                 -- PolyP

type ListS a   = Either (Con ()) (Con (a, [a]))   -- GH

To transform a generic function over a structure type into a generic function 
over the actual datatype, conversion functions between the datatype and the 
structure type are needed. In PolyP they are called inn and out (described
in Section 1.2) and they are primitives in the language. In Generic Haskell this
conversion is done by the compiler and the conversion functions are not available
to the programmer.

As mentioned above, generic functions in both languages are primarily de- 
fined over the structure types. This is done by pattern matching on a type code, 
representing the structure of the datatype. The type codes differ between the 
languages, because they model different sets of datatypes, but the generic func- 
tions are defined in very much the same way. The most significant difference 
is that in Generic Haskell the translations of type abstraction, type application 
and type variables are fixed and cannot be changed by the programmer. 

Given a generic function over a structure type it should be possible to con- 
struct a generic function over the corresponding datatype. In Generic Haskell this 
process is fully automated and hidden from the programmer. In PolyP, however, 
it is the programmer’s responsibility to take care of this. One reason for this is 
that the structure types are more flexible in PolyP, since they are parameterized 
over the recursive occurrences of the datatype. This means that there is not a 
unique datatype generic function for each structure type generic function. For 
instance the structure type generic function fmap2 from Figure 2 can be used 
not only to define the generic map function, pmap, but also the generic cata- and 
anamorphisms, cata and ana. 

3 Generic Programming in Template Haskell 

Generic functions in both PolyP and Generic Haskell are defined by pattern 
matching over the code for a datatype. Such a generic function can be viewed 
as an algorithm for constructing a Haskell function given a datatype code. For 
instance, given the type code for the list datatype a generic map function can 
generate the definition of a map function over lists. Program constructing al- 
gorithms like this can be implemented nicely in Template Haskell; a generic 
function is simply a function from a type code to the abstract syntax for the 
function specialized to the corresponding type. When embedding a generic pro- 
gramming language like PolyP or Generic Haskell in Template Haskell there are 
a few things to consider: 

– Datatype codes

The structure of a datatype has to be coded in a suitable way. How this is 
done depends, of course, on the set of datatypes to be represented, but we 
also have to take into account how the type codes affect the generic function 
definitions. Since we are going to pattern match on the type codes we want 
them to be as simple as possible. 

To avoid having to create the datatype codes by hand we can define a (par- 
tial) function from the abstract syntax of a datatype definition to a type 
code. The (abstract syntax of the) datatype definition can be acquired using 
the reification facilities in Template Haskell. 

– Structure types

To avoid having to manipulate datatype elements directly, generic functions
are defined over a structure type, instead of over the datatype itself. The
structure type reveals the top-level structure of the datatype and allows us
to manipulate datatype elements in a uniform way. Both PolyP and Generic 
Haskell use a binary sum type to model multiple constructors and a binary 
product to model the arguments to the constructors. In this paper we use 
Either as the sum type and ( , ) as the product type. 

When applied to the code for a datatype a generic function produces a 
function specialized for the structure type of that datatype. This means that 
we have to know how to translate between type codes and structure types. 
For instance, the type code Par in PolyP is translated to the structure type 
with the same name. Note that while there is a unique structure type for 
each type code, it is possible for several datatypes to have the same code 
(and thus the same structure). 

– Generic function definitions

A generic function is implemented as a function from a type code to (abstract 
syntax for) the specialized version of the function. It might also be necessary 
to represent the type of a generic function in some way as we will see when 
implementing Generic Haskell in Section 5. 

– From structure types to datatypes

The generic functions defined as described above produce functions special- 
ized for the structure types. What we are interested in, on the other hand, 
are specializations for the actual datatypes. As described in Section 2, PolyP 
and Generic Haskell take two different approaches to constructing these spe- 
cializations. In PolyP it is the responsibility of the user whereas in Generic 
Haskell, it is done by the compiler. In any case we need to be able to convert 
between an element of a datatype and an element of the corresponding struc- 
ture type. How difficult the conversion functions are to generate depends on 
the complexity of the structure types. Both PolyP and Generic Haskell have 
quite simple structure types, so the only information we need to generate the 
conversion functions is the names and arities of the datatype constructors. 
In the approach taken by PolyP, the conversion functions (inn and out) 
are all the compiler needs to define. The programmer of a generic function 
will then use these to lift her function from the structure type level to the 
datatype level. Implementing the Generic Haskell approach on the other 
hand requires some more machinery. For each generic function, the compiler 
must convert the specialization for a structure type into a function that 
operates on the corresponding datatype. In Section 5 we will see how to do 
this for Generic Haskell. 

– Instantiation

Both the PolyP and Generic Haskell compilers do selective specialization, 
that is, generic functions are only specialized to the datatypes on which they 
are actually used in the program. This requires traversing the entire program 
to look for uses of generic functions. When embedding generic programming 
in Template Haskell we do not want to analyze the entire program to find 
out which specializations to construct. What we can do instead is to in- 
line the body of the specialized generic function every time it is used. This 
makes the use of the generic functions easy, but special care has to be taken
to ensure that recursive generic functions do not give rise to infinite specializations.
This is the approach we use when embedding PolyP in Template Haskell 
(Section 4). Another option is to require the user to decide which functions 
to specialize on which datatypes. This makes it harder on the user, but a 
little easier for the implementor of the generic programming language. Since 
our focus is on fast prototyping of generic languages, this is the method we 
choose when implementing Generic Haskell. 

4 PolyP in Template Haskell 

Following the guidelines described in Section 3 we can start to implement our 
first generic programming language, PolyP. 



4.1 Datatype Codes 

The first thing to do is to decide on a datatype encoding. In PolyP, generic 
functions are defined by pattern matching on a pattern functor, so to get a 
faithful implementation of PolyP we should choose the type code to model these 
pattern functors. 

data Code = Code :+: Code | Code :*: Code | Empty
          | Par | Rec | Regular :@: Code | Const Type

This coding corresponds perfectly to the definition of the pattern functors in
Figure 1; we just have to decide what Type and Regular mean. The Template
Haskell libraries define the abstract syntax for Haskell types in a datatype called 
Type so this is a natural choice to model types. The model of a regular datatype 
should contain a code for the corresponding pattern functor, but it also needs to 
contain the constructor names of the regular datatype. This is because we need 
to generate the inn and out functions for a regular datatype. Consequently we 
choose the following representation of a regular datatype: 

type Regular = ([ConName], Code)

To make it easy to get hold of the code for a datatype, we want to define 
a function that converts from the (abstract syntax of a) datatype definition to 
Regular. A problem with this is that one regular datatype might depend on 
another regular datatype, in which case we have to look at the definition of the 
second datatype as well. So instead of just taking the definition of the datatype 
in question our conversion function takes a list of all definitions that might be 
needed together with the name of the type to be coded. 

regular :: [Q Dec] -> TypeName -> Regular

Note that if regular fails, which it will if a required datatype definition is missing 
or the datatype is not regular, we will get a compile time error, since the function 
is executed by the Template Haskell system at compile time. 
We can use the function regular to get the code for the Rose datatype
defined in Section 1.2 as follows:

roseD = regular [reifyDecl Rose, reifyDecl []] "Rose"
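
For orientation, the following sketch shows the kind of value we would expect regular to compute for Rose, written out by hand. It is an illustration under our own assumptions: we take ConName to be the Name type of the Template Haskell library, build names with mkName, and add the helper summands; the actual representation used by the implementation may differ.

```haskell
import Language.Haskell.TH (Name, Type, mkName)

infixr 5 :+:
infixr 6 :*:
infixr 7 :@:

data Code
  = Code :+: Code
  | Code :*: Code
  | Empty
  | Par
  | Rec
  | Regular :@: Code
  | Const Type

type ConName = Name               -- assumption: constructor names are TH names
type Regular = ([ConName], Code)

-- Lists: two constructors, pattern functor Empty :+: (Par :*: Rec).
listD :: Regular
listD = ([mkName "[]", mkName ":"], Empty :+: (Par :*: Rec))

-- Rose trees: one constructor whose second argument applies the
-- regular datatype [] to the recursive position.
roseD :: Regular
roseD = ([mkName "Fork"], Par :*: (listD :@: Rec))

-- Number of alternatives in a code; it should agree with the number
-- of constructor names stored next to it.
summands :: Code -> Int
summands (f :+: g) = summands f + summands g
summands _         = 1
```

Note how the code for Rose embeds the complete Regular value for lists, which is why regular needs the definitions of all datatypes involved.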



4.2 Structure Types 

The structure type of a regular datatype is a pattern functor. See Section 1.2 and 
Figure 1 in particular for their definitions. The mapping between the datatype 
codes (Code) and the pattern functors is the obvious one (a type code maps to 
the pattern functor with the same name). 



4.3 Generic Function Definitions 

Generic functions over pattern functors are implemented as functions from type 
codes to (abstract) Haskell code. For example, the function fmap2 from Figure 2 
in Section 1.2 is implemented as shown in Figure 5. The two definitions are 
strikingly similar, but there are a few important differences, the most obvious one 
being the splices and quasi-quote brackets introduced in the Template Haskell 
definition. Another difference is in the type signature. PolyP has its own type 
system capable of expressing the types of generic functions, but in Template 
Haskell everything inside quasi-quotes has type Q Exp, and thus the type of 
fmap2 is lost. The third difference is that in Template Haskell we have to pass 
the type codes explicitly. 



fmap2 :: Code -> Q Exp
fmap2 f =
  [| \p r -> $(
       case f of
         g :+: h  -> [| $(fmap2 g) p r -+- $(fmap2 h) p r |]
         g :*: h  -> [| $(fmap2 g) p r -*- $(fmap2 h) p r |]
         Empty    -> [| const () |]
         Par      -> [| p |]
         Rec      -> [| r |]
         d :@: g  -> [| $(pmap d) ($(fmap2 g) p r) |]
         Const t  -> [| id |]
     ) |]

Fig. 5. fmap2 in Template Haskell



The (:@:)-case in the definition of fmap2 uses the datatype level function 
pmap to map over the regular datatype d. The definition of pmap is described in 
Section 4.4. 
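
To experiment with such a definition without splicing it into a program, the generated code can be pretty-printed. The sketch below is a cut-down variant under our own assumptions: the (:@:) and Const cases are omitted so that the module is self-contained, the helper showFmap2 is ours, and we rely on runQ and pprint from the template-haskell library of a current GHC.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

infixr 5 :+:
infixr 6 :*:

-- Type codes without the (:@:) and Const cases.
data Code = Code :+: Code | Code :*: Code | Empty | Par | Rec

(-+-) :: (a -> c) -> (b -> d) -> Either a b -> Either c d
(f -+- _) (Left x)  = Left (f x)
(_ -+- g) (Right y) = Right (g y)

(-*-) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f -*- g) (x, y) = (f x, g y)

-- From a type code to the abstract syntax of the specialised map.
fmap2 :: Code -> Q Exp
fmap2 f =
  [| \p r -> $(case f of
        g :+: h -> [| $(fmap2 g) p r -+- $(fmap2 h) p r |]
        g :*: h -> [| $(fmap2 g) p r -*- $(fmap2 h) p r |]
        Empty   -> [| const () |]
        Par     -> [| p |]
        Rec     -> [| r |]) |]

-- Render the code generated for a given type code.
showFmap2 :: Code -> IO String
showFmap2 c = pprint <$> runQ (fmap2 c)
```

In GHCi, showFmap2 (Empty :+: (Par :*: Rec)) >>= putStrLn displays the map function generated for the pattern functor of lists.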
4.4 From Structure Types to Datatypes

The generic functions described in Section 4.3 are defined over pattern functors, 
whereas the functions we are (really) interested in operate on regular datatypes. 
Somehow we must bridge this gap. In PolyP these datatype level functions are 
defined in terms of the pattern functor functions and the functions inn and out, 
that fold and unfold the top-level structure of a datatype. In PolyP, inn and out 
are primitive but in our setting they can be treated just like any generic function, 
that is, they take a code for a datatype and produce Haskell code. However, the
code for a pattern functor is not sufficient: we also need to know the names of
the constructors of the datatype in order to construct and deconstruct values of
the datatype. This gives us the following types for inn and out.

inn, out :: Regular -> Q Exp

To see what code has to be generated we can look at the definition of inn and 
out for lists: 

out_List :: [a] -> Either () (a, [a])
out_List = \xs -> case xs of
  []   -> Left ()
  x:xs -> Right (x,xs)

inn_List :: Either () (a, [a]) -> [a]
inn_List = \xs -> case xs of
  Left ()      -> []
  Right (x,xs) -> x:xs

Basically we have to generate a case expression with one branch for each con- 
structor. In the case of out we match on the constructor and construct a value 
of the pattern functor whereas inn matches on the pattern functor and creates 
a value of the datatype. Note that the arguments to the constructors are left 
untouched, in particular the tail of the list is not unfolded. 

With inn and out at our disposal we define the generic map function over 
a regular datatype, pmap. The definition is shown in Figure 6 together with the 
same definition in PolyP. In PolyP, pmap is a recursive function and we might
be tempted to define it recursively in Template Haskell as well. This is not what
we want, since it would make the generated code infinite. Instead we want to
generate a recursive function, which we can do using a let binding.
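
To make the role of the let binding concrete, the following is a minimal sketch in plain Haskell (no Template Haskell) of the kind of code one would expect the pmap splice to produce for lists. The names outList, innList, fmap2List and pmapList are illustrative assumptions and are not part of the prototype.

```haskell
import Data.Bifunctor (bimap)

-- Unfold the top-level structure of a list into its pattern functor.
outList :: [a] -> Either () (a, [a])
outList []       = Left ()
outList (x : xs) = Right (x, xs)

-- Fold a pattern functor value back into a list.
innList :: Either () (a, [a]) -> [a]
innList (Left ())       = []
innList (Right (x, xs)) = x : xs

-- Map over both holes (parameter and recursive position) of the pattern functor.
fmap2List :: (a -> b) -> (r -> s) -> Either () (a, r) -> Either () (b, s)
fmap2List p r = fmap (bimap p r)

-- The recursion is tied inside a let, so the definition itself stays finite.
pmapList :: (a -> b) -> [a] -> [b]
pmapList =
  let pmap_d f = innList . fmap2List f (pmap_d f) . outList
  in pmap_d
```

Under these assumptions pmapList behaves like the Prelude map; the point is only that the recursive call refers to the let-bound name rather than re-running the generator.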



4.5 Instantiation 

The generic functions defined in this style are very easy to use. To map a function 
f over a rose tree tree we simply write 

$(pmap roseD) f tree 

where roseD is the representation of the rose tree datatype defined in Section 4.1. 

pmap :: Regular d => (a -> b) -> d a -> d b
pmap f = inn . fmap2 f (pmap f) . out

pmap :: Regular -> Q Exp
pmap d = [| let pmap_d f = $(inn d)
                         . $(fmap2 $ functorOf d) f (pmap_d f)
                         . $(out d)
            in pmap_d
         |]

Fig. 6. The pmap function in PolyP and Template Haskell 



5 Generic Haskell in Template Haskell 

In Section 4 we outlined an embedding of PolyP into Template Haskell. In this 
section we do the same for the core part of Generic Haskell. Features of Generic 
Haskell that we do not consider include constructor cases, type indexed types 
and generic abstraction. Type indexed types and generic abstraction should be 
possible to add without much difficulty; constructor cases might require some 
work, though. 



5.1 Datatype Codes 

To be consistent with how generic functions are defined in Generic Haskell we 
choose the following datatype for type codes: 

data Code = Sum | Prod | Unit
          | Con ConDescr | Label LabelDescr
          | Fun | TypeCon TypeName
          | App Code Code | Lam VarName Code | Var VarName

The first seven constructors should be familiar to users of Generic Haskell, al- 
though you do not see the TypeCon constructor when matching on a specific 
datatype in Generic Haskell. The last three constructors App, Lam and Var you 
never see in Generic Haskell. The reason why they are not visible is that the 
interpretation of these type codes is hard-wired into Generic Haskell and cannot 
be changed by the programmer. By making them explicit we get the opportunity 
to experiment with this default interpretation. 

The types ConDescr and LabelDescr describe the properties of constructors 
and labels. In our implementation this is just the name, but it could also include 
information such as fixity and strictness. 

If we define the infix application of App to be left associative, we can write 
the type code for the list datatype as follows: 

listCode = Lam "a" $
  Sum `App` (Con "[]" `App` Unit)
      `App` (Con ":" `App` (Prod
                              `App` Var "a"
                              `App` (TypeCon "[]" `App` Var "a")
                           )
            )

This is not something we want to write by hand for every new datatype, even 
though it can be made much nicer by some suitable helper functions. Instead 
we define a function that produces a type code given the abstract syntax of a 
datatype declaration. 
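
As a hedged illustration of how such codes can be handled as ordinary data, the sketch below declares the Code datatype with all descriptor and name types assumed to be plain strings, gives App a left-associative fixity, and adds a hypothetical helper freeVars (not part of the prototype) that collects the free type variables of a code.

```haskell
import Data.List (nub)

type VarName    = String
type TypeName   = String
type ConDescr   = String   -- assumption: only the constructor name is recorded
type LabelDescr = String   -- assumption: only the label name is recorded

data Code = Sum | Prod | Unit
          | Con ConDescr | Label LabelDescr
          | Fun | TypeCon TypeName
          | App Code Code | Lam VarName Code | Var VarName
  deriving (Eq, Show)

infixl 9 `App`

-- Type code for lists, written with the infix form of App.
listCode :: Code
listCode = Lam "a" $
  Sum `App` (Con "[]" `App` Unit)
      `App` (Con ":" `App` (Prod `App` Var "a"
                                 `App` (TypeCon "[]" `App` Var "a")))

-- Hypothetical helper: free type variables of a code.
freeVars :: Code -> [VarName]
freeVars (Var x)   = [x]
freeVars (Lam x c) = filter (/= x) (freeVars c)
freeVars (App f a) = nub (freeVars f ++ freeVars a)
freeVars _         = []
```

A closed code such as listCode has no free variables, which is a cheap sanity check on hand-written codes.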

typeCode :: Q Dec -> Code

Thus, to get the above code for the list datatype we just write

typeCode (reifyDecl [])



5.2 Structure Types

The structure type for a datatype is designed to model the structure of that 
datatype in a uniform way. Similarly to PolyP, Generic Haskell uses binary sums 
and products to model datatype structures. We diverge from Generic Haskell in 
this implementation in that we use the Haskell prelude types Either and ( , ) for 
sums and products instead of defining our own. Another difference between our 
Template Haskell implementation and standard Generic Haskell is that construc- 
tors and labels are not modeled in the structure type. Compare, for example, 
the structure types of the list datatype in standard Generic Haskell (first) to our 
implementation (second) : 

type ListS a = Sum (Con Unit) (Con (Prod a [a])) -- std GH 
type ListS a = Either () (a, [a]) -- prototype 

Since we are not implementing all features of Generic Haskell, we can allow 
ourselves this simplification. 



5.3 Generic Function Definitions 

The type of a generic function in Generic Haskell depends on the kind of the 
datatype the function is applied to. At a glance it would seem like we could 
ignore the types of a generic function, since Template Haskell does not have any 
support for typing anyway. It turns out, however, that we need the type when 
generating the datatype level functions. There are no type level lambdas in the 
abstract syntax for types in Template Haskell and there is no datatype for kinds. 
We need both these things when defining kind indexed types (type level lambdas are
not strictly needed, but they make things much easier), so we define a datatype for
kinds and a new datatype for types, shown in Figure 7.

data Kind = Star | FunK Kind Kind
data Type = ForallT [VarName] Context Type
          | VarT VarName | ConT ConName
          | TupleT Int | ListT | ArrowT
          | AppT Type Type
          | LamT [VarName] Type

a --> b = ArrowT `AppT` a `AppT` b

Fig. 7. Datatypes for kinds and types

The kind indexed type Map from Section 1.3 can be defined as

_Map Star       = LamT ["s","t"] $ VarT "s" --> VarT "t"
_Map (FunK k l) = LamT ["s","t"] $ ForallT ["a","b"] [] $
      _Map k `AppT` a `AppT` b -->
      _Map l `AppT` AppT s a `AppT` AppT t b
  where [s,t,a,b] = map VarT ["s","t","a","b"]
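
For orientation, the two instances of Map that this definition encodes can also be written directly as Haskell type synonyms. This is only a sketch for kinds * and * -> *; the synonym names MapStar and MapFun and the two example values are assumptions made for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Map at kind *: an ordinary function type.
type MapStar s t = s -> t

-- Map at kind * -> *: a map on the arguments yields a map on the containers.
type MapFun f g = forall a b. MapStar a b -> MapStar (f a) (g b)

-- The usual list map has exactly this type.
mapList :: MapFun [] []
mapList = map

-- So does fmap specialised to Maybe.
mapMaybe' :: MapFun Maybe Maybe
mapMaybe' = fmap
```

The Template Haskell version has to build these types as data because the set of kinds is open-ended, whereas the synonyms above fix two kinds in advance.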

This is much clumsier than the Generic Haskell syntax, but we can make things 
a lot easier by observing that all type indexed types follow the same pattern. 
The only thing we need to know is the number of generic and non-generic arguments
and the type for kind *. With this information we define the function
kindIndexedType:

kindIndexedType :: Int    -- # generic arguments
                -> Int    -- # non-generic arguments
                -> Type   -- type for kind *
                -> Kind -> Type

Using this function we define the type Map as

_Map = kindIndexedType 2 0 $ LamT ["s","t"]
                           $ VarT "s" --> VarT "t"

Now, the type of a generic function depends on the kind of the datatype it is
applied to as well as the datatype itself. So we define a generic type to be a
function from a kind and a type to a type.

type GenericType = Kind -> Type -> Type

The type of the generic map function from Section 1.3 can be defined as

gmapType :: GenericType
gmapType k t = _Map k `AppT` t `AppT` t

A generic function translates a type code to abstract Haskell syntax; we
capture this in the type synonym GenericFun:

type GenericFun = Code -> Q Exp

To deal with types containing variables and abstraction, we also need an
environment in which to store the translation of variables.

type GEnv = [(VarName, Q Exp)]

type GenericFun' = GEnv -> GenericFun

With these types at our disposal we define the default translation, which will
be the same for most generic functions. The function defaultTrans, defined
in Figure 8, takes the name of the generic function that is being constructed
and a GenericFun' that handles the non-standard translation and produces a
GenericFun'. The idea is that a generic function should call defaultTrans on
all type codes that it does not handle (see the generic map function in Figure 9
for an example).



defaultTrans :: VarName -> GenericFun' -> GenericFun'
defaultTrans name gfun env t = case t of
  Con _     -> [| id |]
  Label _   -> [| id |]
  TypeCon c -> varE $ gName name c
  App s t   -> [| $(gfun env s) $(gfun env t) |]
  Lam x t   -> [| \gx -> $(gfun ((x, [| gx |]) : env) t) |]
  Var x     -> fromJust $ lookup x env

varE :: String -> Q Exp

Fig. 8. Default generic translations 



The default translation for constructors and labels is the identity function. 
Since the structure type corresponding to Con and Label is the type level identity 
function of kind * -> *, a generic function applied to Con or Label expects a
generic function for a type of kind * and should return a generic function for the 
same type. 

The default translation of a named type is to call the specialized version of the 
generic function for that type. The function gName takes the name of a generic 
function and the name of a type and returns the name of the specialization of 
the generic function to that type. In our implementation it is the responsibility 
of the user to make sure that this specialization exists. 

In Generic Haskell, the first three cases in defaultTrans can be changed 
by the programmer, for instance using the name of a constructor when pretty 
printing or defining special cases for particular types. The last three cases, on the 
other hand, are hidden from Generic Haskell users. A type level application is
always translated into a value application. When encountering a type abstraction,
a generic function takes the translation of the abstracted variable as an argument,
stores it in the environment and calls itself on the body of the abstraction. The
translation of a type variable is looked up in the environment.

Provided that it does not need to change the default actions, a generic function
only has to provide actions for Sum, Prod and Unit. The definition of the generic
map function from Section 1.3 is shown in Figure 9. For Sum and Prod the generic
map function returns the map functions for Either and (,), and mapping over
the unit type is just the identity function. For all other type codes we call the
defaultTrans function to perform the default actions.



gmap :: GenericFun
gmap t = gmap' [] t
  where
    gmap' env t = case t of
      Sum  -> [| (-+-) |]
      Prod -> [| (-*-) |]
      Unit -> [| id |]
      t    -> defaultTrans "gmap" gmap' env t

Fig. 9. Generic map
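
The translation scheme of Figures 8 and 9 can be tried out without Template Haskell by generating a small expression tree instead of a Q Exp. The sketch below does this under several assumptions that are not taken from the prototype: the Exp type is a stand-in for Template Haskell's abstract syntax, gName simply joins the two names with an underscore, bound variables are named with a "g_" prefix rather than freshly generated, and the list type is called "List" in the example code.

```haskell
import Data.Maybe (fromJust)

type Name = String

data Code = Sum | Prod | Unit
          | Con Name | Label Name | TypeCon Name
          | App Code Code | Lam Name Code | Var Name

-- Stand-in for Template Haskell expressions (assumption).
data Exp = EVar Name | EApp Exp Exp | ELam Name Exp
  deriving (Eq, Show)

type GEnv        = [(Name, Exp)]
type GenericFun  = Code -> Exp
type GenericFun' = GEnv -> GenericFun

-- Hypothetical naming scheme for specialisations.
gName :: Name -> Name -> Name
gName fun ty = fun ++ "_" ++ ty

defaultTrans :: Name -> GenericFun' -> GenericFun'
defaultTrans name gfun env code = case code of
  Con _     -> EVar "id"
  Label _   -> EVar "id"
  TypeCon c -> EVar (gName name c)
  App s t   -> EApp (gfun env s) (gfun env t)
  Lam x t   -> let gx = "g_" ++ x
               in ELam gx (gfun ((x, EVar gx) : env) t)
  Var x     -> fromJust (lookup x env)
  _         -> error "defaultTrans: Sum, Prod and Unit need an explicit case"

gmap :: GenericFun
gmap = gmap' []
  where
    gmap' env code = case code of
      Sum  -> EVar "(-+-)"
      Prod -> EVar "(-*-)"
      Unit -> EVar "id"
      _    -> defaultTrans "gmap" gmap' env code

-- Fully parenthesised rendering of the generated expression.
render :: Exp -> String
render (EVar x)   = x
render (EApp f a) = "(" ++ render f ++ " " ++ render a ++ ")"
render (ELam x b) = "(\\" ++ x ++ " -> " ++ render b ++ ")"

-- Structure type of rose trees, with the constructor wrapper omitted.
roseS :: Code
roseS = Lam "a" (Prod `App` Var "a"
                      `App` (TypeCon "List" `App` (TypeCon "Rose" `App` Var "a")))
```

Rendering gmap roseS gives a lambda that applies (-*-) to the argument map and to gmap_List (gmap_Rose ...), which is the shape of specialised code one would expect for rose trees.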



5.4 From Structure Types to Datatypes 

The generic functions defined in the style described in the previous section gen- 
erate specializations for structure types, so from these structure type functions 
we have to construct functions for the corresponding datatypes. In our imple- 
mentation of PolyP (Section 4) this was the responsibility of the programmer 
of the generic function. The reason for this was that in PolyP, the conversion 
could be done in several different ways, yielding different datatype functions. In 
Generic Haskell, on the other hand, we have a unique datatype function in mind 
for every structure type function. For instance, look at the type of the function 
generated by applying gmap to the code for the list datatype: 

gmap_ListS :: (a -> b) -> Either () (a, [a])
                       -> Either () (b, [b])
gmap_ListS = $(gmap lists)

From this function we want to generate the map function for List with type

gmap_List :: (a -> b) -> [a] -> [b]

To be able to do this we first have to be able to convert between the List 
datatype and its structure type. For this purpose we define a function structEP 
that given the names and arities of a datatype’s constructors generates the con- 
version functions between the datatype and its structure type. 

structEP :: TypeName -> [(ConName, Int)] -> Q Dec 

The structEP function generates a declaration of the conversion functions, so 
for the list datatype it would generate something like the following: 

listEP :: EP [a] (Either () (a, [a]))
listEP = EP out inn
  where
    out []             = Left ()
    out (x:xs)         = Right (x,xs)
    inn (Left ())      = []
    inn (Right (x,xs)) = x:xs

The EP type, shown in Figure 10, models an embedding projection pair. 

data EP a b = EP { from :: a -> b, to :: b -> a }

idEP :: EP a a
idEP = EP id id

funEP :: EP a a' -> EP b b' -> EP (a -> b) (a' -> b')
funEP epA epB = EP (\f -> from epB . f . to epA)
                   (\g -> to epB . g . from epA)

Fig. 10. Embedding projection pairs 



Using listEP we define the map function for the list datatype as

gmap_List :: (a -> b) -> [a] -> [b]
gmap_List f = to (funEP listEP listEP) (gmap_ListS f)
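
The pieces above fit together as ordinary Haskell. The following self-contained sketch writes the structure-type map by hand (in the prototype it would be generated by the splice); the names gmapListS and gmapList and the synonym ListS are chosen here for illustration only.

```haskell
data EP a b = EP { from :: a -> b, to :: b -> a }

funEP :: EP a a' -> EP b b' -> EP (a -> b) (a' -> b')
funEP epA epB = EP (\f -> from epB . f . to epA)
                   (\g -> to epB . g . from epA)

type ListS a = Either () (a, [a])

-- Conversion between lists and their structure type.
listEP :: EP [a] (ListS a)
listEP = EP out inn
  where
    out []               = Left ()
    out (x : xs)         = Right (x, xs)
    inn (Left ())        = []
    inn (Right (x, xs))  = x : xs

-- Map on the structure type, hand-written stand-in for the generated code.
gmapListS :: (a -> b) -> ListS a -> ListS b
gmapListS f = either Left (\(x, xs) -> Right (f x, gmapList f xs))

-- Map on lists, obtained by converting the structure-type map.
gmapList :: (a -> b) -> [a] -> [b]
gmapList f = to (funEP listEP listEP) (gmapListS f)
```

Only the top layer of the list is converted at each step; the recursive call in gmapListS goes back through gmapList, mirroring the way the structure type leaves the tail untouched.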

The embedding projection pair is generated directly from the type of the generic 
function. In this case an embedding projection pair of type 

EP ([a] -> [b]) (Either () (a, [a]) -> Either () (b, [b]))

should be generated. Embedding projection pairs between function types can be 
constructed with funEP, and listEP can convert between a list and an element of 
the list structure type. We define the function typeEP to generate the appropriate 
embedding projection pair. 

typeEP :: Q Exp -> Kind -> GenericType -> Q Exp

The first argument to typeEP is the embedding projection pair converting be- 
tween the datatype and its structure type, the second argument is the kind of 
the datatype and the third argument is the type of the generic function. So to 
get the embedding projection pair used in gmap_List we write 

typeEP [| listEP |] (FunK Star Star) gmapType



5.5 Instantiation

The focus of this article is on fast prototyping of generic programming languages;
this means that we do not make great efforts to facilitate the use of the generic
functions. In particular, we do not try to figure out which specializations
to generate. Instead we provide a function instantiate that generates the
definition of the specialization of a generic function to a particular datatype, as
well as a function structure that generates the embedding projection pair
definition converting between a datatype and its structure type using the function
structEP from Section 5.4.

type Generic  = (VarName, GenericType, GenericFun)
type Datatype = (TypeName, Kind, Code)

instantiate :: Generic -> Datatype -> Q [Dec]
structure   :: Datatype -> Q [Dec]

Using these functions a map function for rose trees can be generated by 

data Rose a = Fork a [Rose a]

listD = ("[]", FunK Star Star, typeCode (reifyDecl []))
roseD = ("Rose", FunK Star Star, typeCode (reifyDecl Rose))
gmapG = ("gmap", gmapType, gmap)

$(structure listD)
$(structure roseD)
$(instantiate gmapG listD)
$(instantiate gmapG roseD)

Since the rose trees contain lists we have to create specializations for the list 
datatype as well. The code generated for gmap specialized to rose trees will look 
something like the following (after some formatting and alpha renaming). Note 
that gmap_RoseS uses both gmap_List and gmap_Rose. 

gmap_RoseS :: (a -> b) -> (a, [Rose a]) -> (b, [Rose b])
gmap_RoseS = \f -> f -*- gmap_List (gmap_Rose f)

gmap_Rose :: (a -> b) -> Rose a -> Rose b
gmap_Rose f = to (funEP roseEP roseEP) (gmap_RoseS f)
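
To check that code of this shape type checks and computes the expected result, it can be completed by hand into a standalone module. In the sketch below roseEP, the product map (-*-) and the list specialisation gmap_List are written out manually as assumptions; in the prototype they would come from structure, the prelude of the generic library and instantiate respectively.

```haskell
data Rose a = Fork a [Rose a] deriving (Eq, Show)

data EP a b = EP { from :: a -> b, to :: b -> a }

funEP :: EP a a' -> EP b b' -> EP (a -> b) (a' -> b')
funEP epA epB = EP (\f -> from epB . f . to epA)
                   (\g -> to epB . g . from epA)

-- Map over both components of a pair.
(-*-) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f -*- g) (x, y) = (f x, g y)

-- Conversion between rose trees and their structure type.
roseEP :: EP (Rose a) (a, [Rose a])
roseEP = EP (\(Fork x ts) -> (x, ts)) (\(x, ts) -> Fork x ts)

-- Stand-in for the list specialisation of gmap.
gmap_List :: (a -> b) -> [a] -> [b]
gmap_List = map

gmap_RoseS :: (a -> b) -> (a, [Rose a]) -> (b, [Rose b])
gmap_RoseS = \f -> f -*- gmap_List (gmap_Rose f)

gmap_Rose :: (a -> b) -> Rose a -> Rose b
gmap_Rose f = to (funEP roseEP roseEP) (gmap_RoseS f)
```

The mutual recursion between gmap_RoseS and gmap_Rose is what makes it necessary to instantiate gmap at both the list and the rose tree datatype.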



6 Conclusions and Future Work 

Efforts to explore the design space of generic programming have been hampered 
by the fact that implementing a generic programming language is a daunting 
task. In this paper we have shown that this does not have to be the case. We 
have presented two prototype implementations of generic programming approx- 
imating PolyP and Generic Haskell. Thanks to the Template Haskell machinery, 
these prototypes could be implemented in a short period of time (each imple- 
mentation consists of a few hundred lines of Haskell code). Comparing these two 
implementations we obtain a better understanding of the design space when it 
comes to implementations of generic programming languages. 

There are a few different areas one might want to focus future work on:

— The idea of fast prototyping is to make it possible to experiment with dif- 
ferent ideas in an easy way. So far, most of our work has been concentrated 
on how to write the prototypes and not so much on experimenting. 

— One of the biggest problems with current generic programming systems is 
efficiency. The conversions between datatypes and structure types take a
lot of time and it would be a big win if one could remove this extra cost. We 
have started working on a simplifier for Haskell expressions that can do this. 

— It would be interesting to see how other generic programming styles fit into 
this framework. In particular one could look at the Data.Generics libraries
in GHC [13] and also at the generic traversals of adaptive OOP [14]. 

— The design of a generic programming language includes the design of a type 
system. In this paper we have ignored the issue of typing, leaving it up to 
the Haskell compiler to find type errors in the specialized code. Is there an 
easy way to build prototype type systems for our implementations? 



Abstract. Functional transposition is a technique for converting relations into
functions aimed at developing the relational algebra via the algebra of functions. 
This paper attempts to develop a basis for generic transposition. Two instances 
of this construction are considered, one applicable to any relation and the other 
applicable to simple relations only. 

Our illustration of the usefulness of the generic transpose takes advantage of the 
free theorem of a polymorphic function. We show how to derive laws of relational 
combinators as free theorems of their transposes. Finally, we relate the topic of 
functional transposition with the hashing technique for efficient data representa- 
tion. 



1 Introduction 

This paper is concerned with techniques for functional transposition of binary relations. 
By functional transposition we mean the faithful representation of a relation by a (total)
function. But what is the purpose of such a representation?

Functions are well-known in mathematics and computer science because of their 
rich theory. For instance, they can be dualized (as happens e.g. with the projection/injection
functions), they can be Galois connected (as happens e.g. with inverse functions)
and they can be parametrically polymorphic. In the latter case, they exhibit theorems
“for free” [20] which can be inferred solely by inspection of their types.

However, (total) functions are not enough. In many situations, functions are partial 
in the sense that they are undefined for some of their input data. Programmers have 
learned to deal with this situation by enriching the codomain of such functions with a 
special error mark indicating that nothing is output. In C/C++, for instance, this leads 
to functions which output pointers to values rather than just values. In functional
languages such as Haskell [13], this leads to functions which output Maybe-values rather
than values, where Maybe is datatype Maybe a = Nothing | Just a.

Partial functions are still not enough because one very often wants to describe what 
is required of a function rather than prescribe how the function should compute its re- 
sult. A well-known example is sorting: sorting a list amounts to finding an ordered per- 
mutation of the list independently of the particular sorting algorithm eventually chosen 
to perform the task (e.g. quicksort, mergesort, etc.). So one is concerned not only with
implementations but also with specifications, which can be vague (e.g. which square root
is meant when one writes “$\sqrt{x}$”?) and non-deterministic. Functional programmers have
learned to cope with (bounded) non-determinism by structuring the codomain of such
functions as sets or lists of values.

In general, such powerset valued functions are models of binary relations: for each
such f one may define the binary relation R such that bRa means $b \in (f\,a)$ for all
suitably typed a and b. Such R is unique for the given f. Conversely, any binary relation
R is uniquely transposed into a set-valued function f. The existence and uniqueness of
such a transformation leads to the identification of a transpose operator $\Lambda$ [6] satisfying
the following universal property,

$f = \Lambda R \;\equiv\; (b\,R\,a \equiv b \in f\,a)$   (1)

for all R from A to B and $f : A \to \mathcal{P}B$. ($\mathcal{P}B$ denotes the set of all subsets of B.)

The power-transpose operator $\Lambda$ establishes a well-known isomorphism between
relations and set-valued functions which is often exploited in the algebra of relations,
see for instance textbook [6]. Less popular and usually not identified as a transpose is
the conversion of a partial function into a Maybe-valued function, for which one can
identify, by analogy with (1), isomorphism $\Gamma$ defined by (for all suitably typed a and b)

$f = \Gamma R \;\equiv\; (b\,R\,a \equiv (f\,a = \mathit{Just}\ b))$   (2)

where R ranges over partial functions.
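
Both transposes can be illustrated on finite relations. The sketch below represents a relation as a list of pairs and uses lists in place of sets, so ordering and duplicates are ignored; the type Rel and the names powerT and maybeT are assumptions made for this illustration.

```haskell
-- A finite relation R from a to b as a list of pairs (b, a), read "b R a".
type Rel b a = [(b, a)]

-- Power transpose: the list of all b related to a given a, cf. (1).
powerT :: Eq a => Rel b a -> a -> [b]
powerT r a = [b | (b, a') <- r, a' == a]

-- Maybe transpose, cf. (2); faithful only when r is simple,
-- i.e. when each a is related to at most one b.
maybeT :: Eq a => Rel b a -> a -> Maybe b
maybeT r a = case powerT r a of
  []      -> Nothing
  (b : _) -> Just b
```

For a simple relation, maybeT r a is Just b exactly when powerT r a is the singleton [b], which is the sense in which the second transpose is a special case of the first.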

Terms total and partial are avoided in relation algebra because they clash with a dif- 
ferent meaning in the context of partial orders and total orders, which are other special 
cases of relations. Instead, one writes entire for total, and simple relation is written in- 
stead of partial function. The word function is reserved for total, simple relations which 
find a central place in the taxonomy of binary relations depicted in Fig. 1 (all other 
entries in the taxonomy will be explained later on). 

Paper objectives. This paper is built around three main topics. First, we want to show
that $\Lambda$ is not the only operator for transposing relations. It certainly is the most
general, but we will identify other such operators as we go down the hierarchy of binary
relations. Our main contribution will be to unify such operators under a single, generic
transpose construct based on the notion of generic membership which extends $\in$ to
collective types other than the powerset [6, 10, 11]. In particular, one of these operators
will be related with the technique of representing finite data collections by hash-tables,
which are efficient data-structures well-known in computer science [21, 12].

Second, we want to stress the usefulness of transposing relations by exploiting
the calculation power of functions, namely free theorems. Such powerful reasoning
devices can be applied to relations provided we represent relations as functions (by
functional transposition), reason functionally and come back to relations where
appropriate. In fact, several relational combinators studied in [6] arise from the definition
of the power-transpose $\Lambda R$ of a relation R. However, some results could have been
produced as free theorems, as we will show in the sequel.

Last but not least, we want to provide evidence of the practicality of the pointfree 
relation calculus. The fact that pointfree notation abstracts from “points” or variables 
makes the reasoning more compact and effective, as is apparent in our final example 
on hash-tables, if compared with its pointwise counterpart which one of the authors did 
several years ago [16]. 

[Figure: taxonomy diagram of binary relations; the only node label recovered is “bijection (isomorphism)”.]

Fig. 1. Binary relation taxonomy



Related work. In the literature, equations (1) and (2) have been dealt with in disparate 
contexts. While (1) is adopted as “the” standard transpose in [6], for instance, (2) is 
studied in [9] as an example of an adjunction between the categories of total and partial 
functions. From the literature on the related topic of generic membership we select [6]
and [11].

Paper structure. This paper is structured as follows. In the next section we present 
an overview of (pointfree) relation algebra. Section 3 presents our relational study of 
generic transpose. In section 4, the two transposes (1) and (2) are framed in the generic 
view. Section 5 presents an example of reasoning based on the generic transpose op- 
erator and its instances. In the remainder of the paper we relate the topic of functional 
transposition with the hash table technique for data representation and draw some con- 
clusions which lead to plans for future work. 



2 Overview of the Relational Calculus 

Relations. Let $B \xleftarrow{R} A$ denote a binary relation on datatypes A (source) and B
(target). We write bRa to mean that pair (b, a) is in R. The underlying partial order
on relations will be written $R \subseteq S$, meaning that S is either more defined or less
deterministic than R, that is, $R \subseteq S \equiv (b\,R\,a \Rightarrow b\,S\,a)$ for all a, b. $R \cup S$ denotes the
union of two relations and $\top$ is the largest relation of its type. Its dual is $\bot$, the smallest
such relation. Equality on relations can be established by $\subseteq$-antisymmetry:
$R = S \equiv R \subseteq S \wedge S \subseteq R$.

Relations can be combined by three basic operators: composition ($R \cdot S$), converse
($R^\circ$) and meet ($R \cap S$). $R^\circ$ is the relation such that $a\,(R^\circ)\,b$ iff bRa holds. Meet
corresponds to set-theoretical intersection and composition is defined in the usual way:
$b\,(R \cdot S)\,c$ holds wherever there exists some mediating $a \in A$ such that $b\,R\,a \wedge a\,S\,c$.
Everywhere $T = R \cdot S$ holds, the replacement of T by $R \cdot S$ will be referred to as a
“factorization” and that of $R \cdot S$ by T as “fusion”. Every relation $B \xleftarrow{R} A$ admits
two trivial factorizations, $R = R \cdot id_A$ and $R = id_B \cdot R$ where, for every X, $id_X$ is the
identity relation mapping every element of X onto itself.

Coreflexives. Some standard terminology arises from the id relation: an (endo)relation
$A \xleftarrow{R} A$ (often called an order) will be referred to as reflexive iff $id_A \subseteq R$ holds
and as coreflexive iff $R \subseteq id_A$ holds. As a rule, subscripts are dropped wherever types
are implicit or easy to infer.

Coreflexive relations are fragments of the identity relation which can be used to
model predicates or sets. The meaning of a predicate p is the coreflexive $[\![p]\!]$ such
that $b\,[\![p]\!]\,a \equiv (b = a) \wedge (p\,a)$, that is, the relation that maps every a which satisfies
p (and only such a) onto itself. The meaning of a set $S \subseteq A$ is $[\![\lambda a.\, a \in S]\!]$, that
is, $b\,[\![S]\!]\,a \equiv (b = a) \wedge a \in S$. Wherever clear from the context, we will omit the
$[\![\ ]\!]$ brackets.

Orders. Preorders are reflexive, transitive relations, where R is transitive iff $R \cdot R \subseteq R$
holds. Partial orders are anti-symmetric preorders, where R is anti-symmetric wherever
$R \cap R^\circ \subseteq id$ holds. A preorder R is an equivalence if it is symmetric, that is, if
$R = R^\circ$.

Converse is of paramount importance in establishing a wider taxonomy of binary
relations. Let us first define the kernel of a relation, $\ker R = R^\circ \cdot R$, and its dual,
$\mathrm{img}\,R = \ker (R^\circ)$, called the image of R. Alternatively, we may define
$\mathrm{img}\,R = R \cdot R^\circ$, since converse commutes with composition, $(R \cdot S)^\circ = S^\circ \cdot R^\circ$, and is
involutive, that is, $(R^\circ)^\circ = R$. Kernel and image lead to the following terminology: a relation R
is said to be entire (or total) iff its kernel is reflexive; or simple (or functional) iff its
image is coreflexive. Dually, R is surjective iff $R^\circ$ is entire, and R is injective iff $R^\circ$ is
simple. This terminology is recorded in the following summary table:





          Reflexive       Coreflexive
ker R     entire R        injective R
img R     surjective R    simple R           (3)

Functions. A relation is a function iff it is both simple and entire. Functions will be
denoted by lowercase letters (f, g, etc.) and are such that bfa means $b = f\,a$. Function
converses enjoy a number of properties of which the following is singled out because
of its role in pointwise-pointfree conversion [3]:

$b\,(f^\circ \cdot R \cdot g)\,a \;\equiv\; (f\,b)\,R\,(g\,a)$   (4)

The overall taxonomy of binary relations is pictured in Fig. 1 where, further to the
standard classification, we add representations and abstractions. These are classes of
relations useful in data-refinement [15]. Because of $\subseteq$-antisymmetry, $\mathrm{img}\,S = id$
wherever S is an abstraction and $\ker R = id$ wherever R is a representation. This ensures
that “no confusion” arises in a representation and that all abstract data are reachable by
an abstraction (“no junk”).

Isomorphisms (such as $\Lambda$ and $\Gamma$ above) are functions, abstractions and
representations at the same time. A particular isomorphism is id, which also is the smallest
equivalence relation on a particular data domain. So, $b\ id\ a$ means the same as $b = a$.

Functions and relations. The interplay between functions and relations is a rich part of
the binary relation calculus. This arises when one relates the arguments and results of
pairs of functions f and g in, essentially, two ways:

$f \cdot S \subseteq R \cdot g$   (5)

$f^\circ \cdot S = R \cdot g$   (6)

As we shall see shortly, (5) is equivalent to $S \subseteq f^\circ \cdot R \cdot g$ which, by (4), means that f
and g produce R-related outputs f b and g a provided their inputs are S-related (bSa).
This situation is so frequent that one says that, everywhere f and g are such that (5)
holds, f is $(R \leftarrow S)$-related to g:

$f\,(R \leftarrow S)\,g \;\equiv\; f \cdot S \subseteq R \cdot g$   (7)

[Diagram for (7): not recoverable.]



For instance, for partial orders $R, S := \leq, \sqsubseteq$, fact $f\,(\leq \leftarrow \sqsubseteq)\,f$ means that f is monotone.
For $R, S := \leq, id$, fact $f\,(\leq \leftarrow id)\,g$ means

$f \mathrel{\dot{\leq}} g \;\equiv\; f \subseteq (\leq) \cdot g$   (8)

that is, f and g are such that $f\,b \leq g\,b$ for all b. Therefore, $\dot{\leq}$ lifts pointwise ordering
$\leq$ to the functional level. In general, relation $R \leftarrow S$ will be referred to as “Reynolds
arrow combinator” (see section 5), which is extensively studied in [3].
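
Read pointwise via (4), the arrow combinator says that S-related inputs are taken to R-related outputs. A minimal sketch of this reading, restricted to finite lists of candidate inputs and with relations given as Boolean-valued functions (both assumptions made for this illustration), is:

```haskell
-- f (R <- S) g, checked over finite candidate inputs:
-- whenever b S a holds, (f b) R (g a) must hold as well.
arrow :: (c -> d -> Bool) -> (b -> a -> Bool)
      -> [b] -> [a] -> (b -> c) -> (a -> d) -> Bool
arrow r s bs xs f g = and [r (f b) (g a) | b <- bs, a <- xs, s b a]

-- Monotonicity is the instance R, S := (<=), (<=) with f = g.
monotoneOn :: Ord a => [a] -> (a -> a) -> Bool
monotoneOn xs f = arrow (<=) (<=) xs xs f f

-- Pointwise ordering of functions is the instance R, S := (<=), (==), cf. (8).
pointwiseLeq :: Ord c => [a] -> (a -> a -> Bool) -> (a -> c) -> (a -> c) -> Bool
pointwiseLeq xs eq f g = arrow (<=) eq xs xs f g
```

The check is only as strong as the list of inputs supplied, so it can refute but not prove the property for infinite types.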

Concerning the other way to combine relations with functions, equality (6) becomes
interesting wherever R and S are preorders,

$f^\circ \cdot \sqsubseteq \;=\; \leq \cdot\, g$   (9)

[Diagram for (9): not recoverable.]

in which case f, g are always monotone and said to be Galois connected. Function f
(resp. g) is referred to as the lower (resp. upper) adjoint of the connection. By
introducing variables in both sides of (9) via (4) we obtain

$(f\,b) \sqsubseteq a \;\equiv\; b \leq (g\,a)$   (10)

Note that (9) boils down to $f^\circ = g$ (i.e. $f = g^\circ$) wherever $\leq$ and $\sqsubseteq$ are id, in which
case f and g are isomorphisms, that is, $f^\circ$ is also a function and $f\,b = a \equiv b = f^\circ a$
holds.
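
A familiar numeric instance of (10), offered here as an illustration rather than an example from the paper, takes both orders to be the usual ordering on integers, with multiplication by a positive constant as lower adjoint and flooring division as upper adjoint. The helper galois below is a hypothetical name; it simply tests (10) on a finite range.

```haskell
-- Test (f b) <= a  ==  b <= (g a) for all a, b drawn from a finite list.
galois :: (Int -> Int) -> (Int -> Int) -> [Int] -> Bool
galois f g xs = and [(f b <= a) == (b <= g a) | a <- xs, b <- xs]

-- Multiplication by 3 and flooring division by 3 are Galois connected.
mulDivConnected :: Bool
mulDivConnected = galois (* 3) (`div` 3) [-20 .. 20]

-- Truncating division is not an upper adjoint once negative numbers are included.
mulQuotConnected :: Bool
mulQuotConnected = galois (* 3) (`quot` 3) [-20 .. 20]
```

The contrast between div and quot shows how (10) pins the upper adjoint down uniquely: rounding in the wrong direction on negative inputs breaks the equivalence.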

For further details on the rich theory of Galois connections and examples of
application see [1, 3]. Galois connections in which the two preorders are relation inclusion
($\leq, \sqsubseteq := \subseteq, \subseteq$) are particularly interesting because the two adjoints are relational
combinators and the connection itself is their universal property. The following table lists
connections which are relevant for this paper:

$(f\,X) \subseteq Y \;\equiv\; X \subseteq (g\,Y)$

Description                  f                   g                   Obs.
Converse                     $(\_)^\circ$        $(\_)^\circ$
Shunting rule                $(f \cdot)$         $(f^\circ \cdot)$   NB: f is a function
“Converse” shunting rule     $(\cdot f^\circ)$   $(\cdot f)$         NB: f is a function
Left-division                $(R \cdot)$         $(R \setminus)$     read “R under ...”
Right-division               $(\cdot R)$         $(/ R)$             read “... over R”
Difference                   $(\_ - R)$          $(R \cup \_)$
                                                                                   (11)

From the two of these called shunting rules one infers the very useful fact that equating
functions is the same as comparing them in either way:

$f = g \;\equiv\; f \subseteq g \;\equiv\; g \subseteq f$   (12)

Membership. Equation (1) involves the set-theoretic membership relation $A \xleftarrow{\in} \mathcal{P}A$.
Sentence $a \in x$ (meaning that “a belongs to x” or “a occurs in x”) can be generalized
to x’s other than sets. For instance, one may check whether a particular integer occurs
in one or more leaves of a binary tree, or of any other collective or container type F.
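The binary-tree case mentioned above can be sketched in Haskell. The type `Tree` and the function `occursIn` are names assumed here for illustration only; they do not come from the paper.

```haskell
-- A binary tree with data at the leaves: one possible container type F
-- (hypothetical example type, not taken from the paper).
data Tree a = Leaf a | Fork (Tree a) (Tree a)

-- occursIn a t holds iff a occurs in at least one leaf of t,
-- i.e. a pointwise reading of "a is a member of t".
occursIn :: Eq a => a -> Tree a -> Bool
occursIn a (Leaf x)   = a == x
occursIn a (Fork l r) = occursIn a l || occursIn a r
```

The recursion follows the shape of the type, which is how the inductive definition of membership for polynomial relators later in this section also proceeds.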

Such a generic membership relation will have type $A \xleftarrow{\in_F} F A$, where F is a
type parametric on A. Technically, the parametricity of F is captured by regarding it as
a relator [5], a concept which extends functors to relations: $F A$ describes a parametric
type while $F R$ is a relation from $F A$ to $F B$ provided $R$ is a relation from A to B.
Relators are monotone and commute with composition, converse and the identity.

The simplest relators are the identity relator Id, which is such that $Id\ A = A$ and
$Id\ R = R$, and the constant relator K (for a particular concrete data type K) which is
such that $K A = K$ and $K R = id_K$.

Relators can also be multi-parametric. Two well-known examples of binary relators 
are product and sum. 



    $R \times S = \langle R \cdot \pi_1, S \cdot \pi_2 \rangle$    (13)

    $R + S = [i_1 \cdot R, i_2 \cdot S]$    (14)

where $\pi_1, \pi_2$ denote the projection functions of a Cartesian product, $i_1, i_2$ denote the
injection functions of a disjoint union, and the split/either relational combinators are
defined by

    $\langle R, S \rangle = \pi_1^\circ \cdot R \cap \pi_2^\circ \cdot S$    (15)

    $[R, S] = (R \cdot i_1^\circ) \cup (S \cdot i_2^\circ)$    (16)

By putting these four kinds of relator (product, sum, identity and constant) together 
with fixpoint definition one is able to specify a large class of parametric structures - 
called polynomial - such as those implementable in Haskell. For instance, the Maybe 
datatype is an implementation of polynomial relator F = Id + 1 (i.e. F A = A + 1),
where 1 denotes the singleton datatype, written () in Haskell. 
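As a small illustration of the reading F A = A + 1, the two functions below witness the correspondence between `Maybe a` and `Either a ()`. The names `toSum` and `fromSum` are assumptions made for this sketch.

```haskell
-- Maybe a viewed as the sum a + 1, with Either a () standing for a + 1.
toSum :: Maybe a -> Either a ()
toSum (Just a) = Left a      -- Just plays the role of injection i1
toSum Nothing  = Right ()    -- Nothing plays the role of i2 applied to ()

fromSum :: Either a () -> Maybe a
fromSum (Left a)   = Just a
fromSum (Right ()) = Nothing
```

The two functions are mutually inverse on total values, so nothing is lost by working with either presentation.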

There is more than one way to generalize $A \xleftarrow{\in} \mathcal{P}A$ to relators other than the
powerset. (For a thorough presentation of the subject see chapter 4 of [10].) For the
purpose of this paper it will be enough to say that $A \xleftarrow{\in_F} F A$, if it exists, is a lax
natural transformation [6], that is,

    $\in_F \cdot F R \subseteq R \cdot \in_F$    (17)

holds. Moreover, relators involving +, x, Id and constants have membership defined 
inductively as follows: 



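    $\in_K \stackrel{\mathrm{def}}{=} \bot$    (18)

    $\in_{Id} \stackrel{\mathrm{def}}{=} id$    (19)

    $\in_{F \times G} \stackrel{\mathrm{def}}{=} (\in_F \cdot \pi_1) \cup (\in_G \cdot \pi_2)$    (20)

    $\in_{F + G} \stackrel{\mathrm{def}}{=} [\in_F, \in_G]$    (21)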



3 A Study of Generic Transposition 

Thanks to rule (4), it is easy to remove variables b and a from transposition rules (1) 
and (2), yielding 

    $f = \Lambda R \equiv (R = \in \cdot f)$    (22)

    $f = \Gamma R \equiv (R = i_1^\circ \cdot f)$    (23)

where, in the second equivalence, $R$ ranges over simple relations and Just is replaced
by injection $i_1$ associated with relator Id + 1. In turn, $f$ and $R$ can also be abstracted
from (22,23) using the same rule, whereby we end up with $\Lambda = (\in \cdot)^\circ$ and $\Gamma = (i_1^\circ \cdot)^\circ$.

The generalization of both equations starts from the observation that, in the same
way $\in$ is the membership relation associated with the powerset, $i_1^\circ$ is the membership
relation associated with Id + 1, as can be easily checked:

    $\in_{Id+1}$
=   { by (21) }
    $[\in_{Id}, \in_1]$
=   { by (19) and (18) }
    $[id, \bot]$    (24)
=   { by (16) and properties of $\bot$ }
    $id \cdot i_1^\circ$
=   { identity }
    $i_1^\circ$

This suggests the definitions and results which follow. 
Definition. Given a relator F with membership relation $\in_F$, a particular class of binary
relations $A \xleftarrow{R} B$ is said to be F-transposable iff, for each such R, there exists a
unique function $f : B \rightarrow F A$ such that $\in_F \cdot f = R$ holds. This is equivalent (by
skolemisation) to saying that there exists a function $\Gamma_F$ (called the F-transpose) such
that, for all such $R$ and $f$,

    $f = \Gamma_F R \equiv \in_F \cdot f = R$    (25)

(cf. diagram: $R : A \leftarrow B$ factors as $f = \Gamma_F R : F A \leftarrow B$ followed by $\in_F : A \leftarrow F A$).

In other words, such a generic F-transpose operator is the converse of membership
post-composition:

    $\Gamma_F = (\in_F \cdot)^\circ$    (26)

The two instances we have seen of (25) are the power-transpose ($F A = \mathcal{P}A$) and the
Maybe-transpose ($F A = A + 1$). While the former is known to be applicable to every
relation [6], the latter is only applicable to simple relations, a result to be justified after
we review the main properties of generic transposition. These extend those presented in
[6] for the power-transpose.
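The two instances can be illustrated on finite relations. The sketch below assumes a relation $R : A \leftarrow B$ is stored as a list of pairs `(a, b)` and that sets are approximated by lists; `Rel`, `powerT`, `maybeT`, `isSimple`, `untransposeP` and `untransposeM` are hypothetical names introduced only for this illustration.

```haskell
import Data.List (nub)

-- A finite relation R : A <- B, stored as pairs (a, b) meaning "a R b".
type Rel a b = [(a, b)]

-- Power-transpose: maps b to the collection of all a such that a R b.
powerT :: Eq b => Rel a b -> b -> [a]
powerT r b = [a | (a, b') <- r, b' == b]

-- Maybe-transpose: faithful only when R is simple (at most one a per b);
-- on a non-simple R it silently keeps the first related a.
maybeT :: Eq b => Rel a b -> b -> Maybe a
maybeT r b = case powerT r b of
  (a : _) -> Just a
  []      -> Nothing

-- A relation is simple iff every b is related to at most one a.
isSimple :: (Eq a, Eq b) => Rel a b -> Bool
isSimple r = all (\b -> length (nub (powerT r b)) <= 1) (nub (map snd r))

-- Membership after the transpose, tabulated over a finite list of inputs;
-- this is the left-hand side of the universal property (25).
untransposeP :: [b] -> (b -> [a]) -> Rel a b
untransposeP bs f = [(a, b) | b <- bs, a <- f b]

untransposeM :: [b] -> (b -> Maybe a) -> Rel a b
untransposeM bs f = [(a, b) | b <- bs, Just a <- [f b]]
```

Composing `untransposeP bs` with `powerT` gives back the pairs of the original relation whose second component lies in `bs`; the same holds for `untransposeM` and `maybeT` provided the relation is simple.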

Properties. Cancellation and reflection

    $\in_F \cdot \Gamma_F R = R$    (27)

    $\Gamma_F \in_F\ = id$    (28)

arise from (25) by substitutions $f := \Gamma_F R$ and $f := id$, respectively. Fusion

    $\Gamma_F (T \cdot S) = (\Gamma_F T) \cdot S \ \Leftarrow\ (\Gamma_F T) \cdot S$ is a function    (29)

arises in the same way - this time for substitution $f := (\Gamma_F T) \cdot S$ - as follows (assuming
the side condition ensuring that $(\Gamma_F T) \cdot S$ is a function):

    $(\Gamma_F T) \cdot S = \Gamma_F R \ \equiv\ \in_F \cdot ((\Gamma_F T) \cdot S) = R$
≡   { associativity }
    $(\in_F \cdot \Gamma_F T) \cdot S = R$
≡   { cancellation (27) }
    $T \cdot S = R$



The side condition of (29) requires S to be entire but not necessarily simple. In fact,
it suffices that $img\ S \subseteq \ker (\Gamma_F T)$ since, in general, the simplicity of $f \cdot S$ equivales
$img\ S \subseteq \ker f$:

    $img\ S \subseteq \ker f$
≡   { definitions }
    $(S \cdot S^\circ) \subseteq f^\circ \cdot f$
≡   { id is the unit of composition }
    $(S \cdot S^\circ) \subseteq f^\circ \cdot id \cdot f$
≡   { shunting rules (11) }
    $f \cdot (S \cdot S^\circ) \cdot f^\circ \subseteq id$
≡   { composition is associative ; converse of composition }
    $(f \cdot S) \cdot (f \cdot S)^\circ \subseteq id$
≡   { definition of img }
    $img\ (f \cdot S) \subseteq id$
≡   { simplicity }
    $(f \cdot S)$ is simple

In summary, the simplicity of (entire) S is a sufficient (but not necessary) condition
for the fusion law (29) to hold. In particular, S can be a function, and it is under this
condition that the law is presented in [6]¹.

Substitution $f := \Gamma_F S$ in (25) and cancellation (27) lead to the injectivity law,

    $\Gamma_F S = \Gamma_F R \equiv S = R$    (30)

Finally, the generic version of the absorption property,

    $F R \cdot \Gamma_F S = \Gamma_F (R \cdot S) \ \Leftarrow\ R \cdot \in_F\ \subseteq\ \in_F \cdot F R$    (31)

is justified as follows:

    $F R \cdot \Gamma_F S = \Gamma_F (R \cdot S)$
≡   { universal property (25) }
    $\in_F \cdot F R \cdot \Gamma_F S = R \cdot S$
≡   { assume $\in_F \cdot F R = R \cdot \in_F$ }
    $R \cdot \in_F \cdot \Gamma_F S = R \cdot S$
≡   { cancellation (27) }
    $R \cdot S = R \cdot S$

The side condition of (31) arises from the property assumed in the second step of the
proof. Together with (17), it establishes the required equality by anti-symmetry, which
is equivalent to writing $F R = \Gamma_F (R \cdot \in_F)$ in such situations.

1 Cf. exercise 5.9 in [6]. See also exercise 4.48 for a result which is of help in further reasoning 
about the side condition of (29). 
Unit and inclusion. Two concepts of set-theory can be made generic in the context
above. The first one has to do with singletons, that is, data structures which contain a
single datum. The function $\tau_F$ mapping every A to its singleton of type F is obtainable
by transposing id, $\tau_F = \Gamma_F\, id$, and is such that (by the fusion law) $\tau_F \cdot f = \Gamma_F f$. Another
concept relevant in the sequel is generic inclusion, defined by

    $F A \xleftarrow{\ \in_F \setminus \in_F\ } F A$    (32)

and involving left division (11), the relational operator which is defined by the fact that
$(R \setminus \_)$ is the upper-adjoint of $(R \cdot \_)$ for every R.

4 Instances of Generic Transposition 

In this section we discuss the power-transpose ($F = \mathcal{P}$) and the Maybe-transpose
($F = Id + 1$) as instances of the generic transpose (25). Unlike the former, the latter
is not applicable to every relation. To conclude that only simple relations are Maybe-transposable,
we first show that, for every F-transposable R, its image is at most the
image of $\in_F$:

    $img\ R \subseteq img \in_F$    (33)

The proof is easy to follow:

    $img\ R$
=   { definition }
    $R \cdot R^\circ$
=   { R is F-transposable ; cancellation (27) }
    $(\in_F \cdot \Gamma_F R) \cdot (\in_F \cdot \Gamma_F R)^\circ$
=   { converses }
    $\in_F \cdot \Gamma_F R \cdot (\Gamma_F R)^\circ \cdot \in_F^\circ$
⊆   { $\Gamma_F R$ is simple ; monotonicity }
    $\in_F \cdot \in_F^\circ$
=   { definition }
    $img \in_F$

So, $\in_F$ restricts the class of relations R which are F-transposable. Concerning the
power-transpose, it is easy to see that $img \in\ = \top$ since, for every $a, a'$, there exists at
least the set $\{a, a'\}$ which both $a$ and $a'$ belong to. Therefore, no restriction is imposed
on $img\ R$ and transposition witnesses the well-known isomorphism $(2^A)^B \cong 2^{B \times A}$
(writing $2^A$ for $\mathcal{P}A$ and identifying every relation with its graph, a set of pairs).

By contrast, simple memberships can only be associated to the transposition of simple
relations. This is what happens with $\in_{Id+1}\ = i_1^\circ$ which, as the converse of an injection,
is simple (3).
Conversely, appendix A shows that all simple relations are (Id + 1)-transposable.
Therefore, (Id + 1)-transposability defines the class of simple relations and witnesses
isomorphism $(B + 1)^A \cong A \rightharpoonup B$, where $A \rightharpoonup B$ denotes the set of all simple relations
from A to B².

Another difference between the two instances of generic transposition considered
so far can be found in the application of the absorption property (31). That its side
condition holds for the Maybe-transpose is easy to show:

    $R \cdot i_1^\circ \subseteq i_1^\circ \cdot (R + id)$
≡   { shunting }
    $i_1 \cdot R \subseteq (R + id) \cdot i_1$
⇐   { anti-symmetry }
    $i_1 \cdot R = (R + id) \cdot i_1$
≡   { R + S (14) is a coproduct [6] }
    $i_1 \cdot R = i_1 \cdot R$

Concerning the power-transpose, [6] define the absorption property for the existential
image functor, $E R = \Lambda (R \cdot \in)$, which coincides with the powerset relator for functions.
However, E is not a relator³. So, the absorption property of the power-transpose can only
be used where R is a function: $\mathcal{P} f \cdot \Lambda S = \Lambda (f \cdot S)$.

Finally, inclusion (32) for the power-transpose is the set-theoretic subset ordering
[6], while its Maybe instance corresponds to the expected “flat-cpo ordering”:

    $x (\in_{Id+1} \setminus \in_{Id+1}) y \ \equiv\ \forall a.\ x = (i_1\, a) \Rightarrow y = (i_1\, a)$

So Nothing will be included in anything and every “non-Nothing” x will be included
only in itself⁴.
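The pointwise reading of this inclusion can be written down directly. The function name `flatIncl` is an assumption made for this sketch; it is defined from the formula above and not from any library ordering.

```haskell
-- flatIncl x y: every a with x = Just a also satisfies y = Just a.
flatIncl :: Eq a => Maybe a -> Maybe a -> Bool
flatIncl Nothing  _ = True          -- Nothing is included in anything
flatIncl (Just a) y = y == Just a   -- Just a is included only in itself
```

For example, `flatIncl Nothing (Just 1)` and `flatIncl (Just 1) (Just 1)` hold, whereas `flatIncl (Just 1) (Just 2)` and `flatIncl (Just 1) Nothing` do not.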

5 Applications of Generic Transpose 

The main purpose of representing relations by functions is to take advantage of the
(sub)calculus of functions when applied to the transposed relations. In particular, transposition
can be used to infer properties of relational combinators. Suppose that $f \oplus g$
is a functional combinator whose properties are known, for instance, $f \oplus g = [f, g]$ for
which we know universal property

    $k = [f, g] \ \equiv\ k \cdot i_1 = f \ \wedge\ k \cdot i_2 = g$    (34)

2 This isomorphism is central to the data refinement calculus presented in [15]. 

3 See [10] and exercise 5.15 in [6]. 

4 This is, in fact, the ordering <= which is derived for Maybe as instance of the Ord class in 
the Haskell Prelude [13]. 
We may inquire about the corresponding property of another, this time relational, combinator
$R \otimes S$ induced by transposition:

    $\Gamma_F (R \otimes S) = (\Gamma_F R) \oplus (\Gamma_F S)$    (35)
≡   { (25) }
    $R \otimes S = \in_F \cdot ((\Gamma_F R) \oplus (\Gamma_F S))$    (36)

This can happen in essentially two ways, which are described next. 

Proof of universality by transposition. It may happen that the universal property of
functional combinator $\oplus$ is carried intact along the move from functions to relations. A
good example of this is relational coproduct, whose existence is shown in [6] to stem
from functional coproducts (34) by transposition⁵. One only has to instantiate (34) for
$k, f, g := \Gamma_F T, \Gamma_F R, \Gamma_F S$ and reason:

    $\Gamma_F T = [\Gamma_F R, \Gamma_F S] \ \equiv\ (\Gamma_F T) \cdot i_1 = \Gamma_F R \ \wedge\ (\Gamma_F T) \cdot i_2 = \Gamma_F S$
≡   { (25) and fusion (29) twice, for $S := i_1, i_2$ }
    $T = \in_F \cdot [\Gamma_F R, \Gamma_F S] \ \equiv\ \Gamma_F (T \cdot i_1) = \Gamma_F R \ \wedge\ \Gamma_F (T \cdot i_2) = \Gamma_F S$
≡   { injectivity (30) }
    $T = \in_F \cdot [\Gamma_F R, \Gamma_F S] \ \equiv\ T \cdot i_1 = R \ \wedge\ T \cdot i_2 = S$
≡   { define $[R, S] = \in_F \cdot [\Gamma_F R, \Gamma_F S]$ }
    $T = [R, S] \ \equiv\ T \cdot i_1 = R \ \wedge\ T \cdot i_2 = S$
≡   { coproduct definition }
    $[R, S]$ is a coproduct

Defined in this way, relational coproducts enjoy all properties of functional coproducts, 
eg. fusion, absorption etc. 

This calculation, however, cannot be dualized to the generalization of the split-combinator
$\langle f, g \rangle$ to relational $\langle R, S \rangle$. In fact, relational product is not a categorical
product, which means that some properties will not hold, namely the fusion law,

    $\langle g, h \rangle \cdot f = \langle g \cdot f, h \cdot f \rangle$    (37)

when $g, h, f$ are replaced by relations. According to [6], what we have is

    $\langle R, S \rangle \cdot f = \langle R \cdot f, S \cdot f \rangle$    (38)

whose proof can be carried out by resorting to the explicit definition of the split combinator
(15) and some properties of simple relations grounded on the so-called modular
law⁶.

5 For the same outcome without resorting to transposition see §2.5.2 of [10]. 

6 See Exercise 5.9 in [6], 
In the following we present an alternative proof of (38) as an example of the calculation
power of transposes combined with Reynolds abstraction theorem in the pointfree
style [3]. The proof is more general and leads to other versions of the law, depending
upon which transposition is adopted, that is, which class of relations is considered.
From the type of functional split,

    $\langle \_, \_ \rangle : ((A \times B) \leftarrow C) \leftarrow ((A \leftarrow C) \times (B \leftarrow C))$    (39)

we want to define the relational version of this combinator - denote it by $(\_ \otimes \_)$ for the
time being - via the adaptation of $\langle \_, \_ \rangle$ (39) to transposed relations, to be denoted by
$(\_ \oplus \_)$. This will be of type

    $t = (F (A \times B) \leftarrow C) \leftarrow ((F A \leftarrow C) \times (F B \leftarrow C))$    (40)

Reynolds abstraction theorem. Instead of defining $(\_ \oplus \_)$ explicitly, we will reason
about its properties by applying the abstraction theorem due to J. Reynolds [19] and
advertised by P. Wadler [20] under the “theorem for free” heading. We follow the pointfree
styled presentation of this theorem in [3], which is remarkably elegant: let $f$ be
a polymorphic function $f : t$, whose type $t$ can be written according to the following
“grammar” of types:

    $t ::= t' \leftarrow t''$
    $t ::= F(t_1, \ldots, t_n)$    for n-ary relator F
    $t ::= v$    for v a type variable (= polymorphism “dimension”)

Let $V$ be the set of type variables involved in type $t$; $\{R_v\}_{v \in V}$ be a $V$-indexed family of
relations ($f_v$ in case all such $R_v$ are functions); and $R_t$ be a relation defined inductively
as follows:

    $R_{t := F(t_1, \ldots, t_n)} = F(R_{t_1}, \ldots, R_{t_n})$
    $R_{t := v} = R_v$
    $R_{t := t' \leftarrow t''} = R_{t'} \leftarrow R_{t''}$

where $R_{t'} \leftarrow R_{t''}$ is defined by (7). The free theorem of type $t$ reads as follows: given
any function $f : t$ and $V$ as above, $f\ R_t\ f$ holds for any relational instantiation of type
variables in $V$. Note that this theorem is a result about $t$ and holds for any polymorphic
function of type $t$ independently of its actual definition⁷.

In the remainder of this section we deduce the free theorem of type $t$ (40) and draw
conclusions about the fusion and absorption properties of relational split based on such
a theorem. First we calculate $R_t$:

    $R_t$
=   { induction on the structure of $t$ (40) }



7 See [3] for comprehensive evidence on the power of this theorem when combined with
Galois connections, which stems basically from the interplay between equations (5) and (6).
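    $(F (R_A \times R_B) \leftarrow R_C) \leftarrow ((F R_A \leftarrow R_C) \times (F R_B \leftarrow R_C))$
=   { substitution $R_A, R_B, R_C := R, S, Q$ in order to remove subscripts }
    $(F (R \times S) \leftarrow Q) \leftarrow ((F R \leftarrow Q) \times (F S \leftarrow Q))$

Next we calculate the free theorem of $(\_ \oplus \_) : t$:

    $(\_ \oplus \_)\ R_t\ (\_ \oplus \_)$
≡   { expansion of $R_t$ }
    $(\_ \oplus \_)\ ((F (R \times S) \leftarrow Q) \leftarrow ((F R \leftarrow Q) \times (F S \leftarrow Q)))\ (\_ \oplus \_)$
≡   { meaning of Reynolds arrow combinator (7) }
    $(\_ \oplus \_) \cdot ((F R \leftarrow Q) \times (F S \leftarrow Q)) \subseteq (F (R \times S) \leftarrow Q) \cdot (\_ \oplus \_)$
≡   { shunting (11) }
    $(F R \leftarrow Q) \times (F S \leftarrow Q) \subseteq (\_ \oplus \_)^\circ \cdot (F (R \times S) \leftarrow Q) \cdot (\_ \oplus \_)$
≡   { going pointwise and (4) }
    $(f, g) ((F R \leftarrow Q) \times (F S \leftarrow Q)) (h, k) \Rightarrow (f \oplus g) (F (R \times S) \leftarrow Q) (h \oplus k)$
≡   { product relator and (7) }
    $f (F R \leftarrow Q) h \wedge g (F S \leftarrow Q) k \Rightarrow (f \oplus g) \cdot Q \subseteq F (R \times S) \cdot (h \oplus k)$
≡   { Reynolds arrow combinator (7) three times }
    $f \cdot Q \subseteq F R \cdot h \wedge g \cdot Q \subseteq F S \cdot k \Rightarrow (f \oplus g) \cdot Q \subseteq F (R \times S) \cdot (h \oplus k)$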

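Should we replace functions $f, h, g, k$ by transposed relations $\Gamma_F U, \Gamma_F V, \Gamma_F X, \Gamma_F Z$,
respectively, we obtain

    $((\Gamma_F U) \oplus (\Gamma_F X)) \cdot Q \subseteq F (R \times S) \cdot ((\Gamma_F V) \oplus (\Gamma_F Z))$    (41)

provided conjunction

    $(\Gamma_F U) \cdot Q \subseteq F R \cdot (\Gamma_F V) \ \wedge\ (\Gamma_F X) \cdot Q \subseteq F S \cdot (\Gamma_F Z)$    (42)

holds. Assuming (35), (41) can be re-written as

    $\Gamma_F (U \otimes X) \cdot Q \subseteq F (R \times S) \cdot \Gamma_F (V \otimes Z)$    (43)

At this point we restrict Q to a function q and apply the fusion law (29) without extra
side conditions:

    $\Gamma_F ((U \otimes X) \cdot q) \subseteq F (R \times S) \cdot \Gamma_F (V \otimes Z)$    (44)

For $R, S := id, id$ we will obtain - “for free” - the standard fusion law

    $(U \otimes X) \cdot q = (U \cdot q \otimes X \cdot q)$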
presented in [6] for the split combinator (38), i.e. for $(R \otimes S) = \langle R, S \rangle$. In the reasoning,
all factors involving R and S disappear and fusion takes place in both conjuncts of (42).
Moreover, inclusion ($\subseteq$) becomes equality of transposed relations - thanks to (12) - and
injectivity (30) is used to remove all occurrences of $\Gamma_F$.

Wherever R and S are not identities, one has different results depending on the
behaviour of the chosen transposition concerning the absorption property (31).

Maybe transpose. In case of simple relations under the Maybe-transpose, absorption
has no side condition, and so (44) rewrites to

    $(U \otimes X) \cdot q = (R \times S) \cdot (V \otimes Z)$    (45)

by further use of (12) - recall that transposed relations are functions - and injectivity
(30), provided (42) holds, which boils down to $U \cdot q = R \cdot V$ and $X \cdot q = S \cdot Z$ under
a similar reasoning. For $q := id$ and $(\_ \otimes \_)$ instantiated to relational split, this becomes
absorption law

    $\langle R \cdot V, S \cdot Z \rangle = (R \times S) \cdot \langle V, Z \rangle$ if R, S, V, Z are simple    (46)

In summary, our reasoning has shown that the absorption law for simple relations is a
free theorem.
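Law (46) can be spot-checked on small finite relations. The sketch below assumes relations are lists of pairs `(output, input)`; the names `Rel`, `comp`, `split`, `prod`, `sameRel` and `absorbs` are hypothetical and only serve this illustration. It tests instances of the law and proves nothing.

```haskell
import Data.List (nub, sort)

-- A finite relation R : A <- B as pairs (a, b) meaning "a R b".
type Rel a b = [(a, b)]

-- Relational composition R . S
comp :: Eq b => Rel a b -> Rel b c -> Rel a c
comp r s = [(a, c) | (a, b) <- r, (b', c) <- s, b == b']

-- Relational split <R, S>, following definition (15) pointwise
split :: Eq c => Rel a c -> Rel b c -> Rel (a, b) c
split r s = [((a, b), c) | (a, c) <- r, (b, c') <- s, c == c']

-- Relational product R x S, following definition (13) pointwise
prod :: Rel a c -> Rel b d -> Rel (a, b) (c, d)
prod r s = [((a, b), (c, d)) | (a, c) <- r, (b, d) <- s]

-- Equality of relations as sets of pairs
sameRel :: (Ord a, Ord b) => Rel a b -> Rel a b -> Bool
sameRel r s = sort (nub r) == sort (nub s)

-- Checks one instance of (46): <R.V, S.Z> = (R x S) . <V, Z>
absorbs :: (Ord a1, Ord a2, Ord c, Eq b1, Eq b2)
        => Rel a1 b1 -> Rel a2 b2 -> Rel b1 c -> Rel b2 c -> Bool
absorbs r s v z =
  sameRel (split (comp r v) (comp s z)) (comp (prod r s) (split v z))
```

Calling `absorbs` on four simple relations of compatible types evaluates both sides of (46) and compares them as sets of pairs.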

Power transpose. In case of arbitrary relations under the power-transpose, absorption
requires R and S in (44) to be functions (say r, s), whereby the equation re-writes to

    $\Gamma_F ((U \otimes X) \cdot q) \subseteq \Gamma_F ((r \times s) \cdot (V \otimes Z))$    (47)

provided $\Gamma_F (U \cdot q) \subseteq \Gamma_F (r \cdot V)$ and $\Gamma_F (X \cdot q) \subseteq \Gamma_F (s \cdot Z)$ hold. Again by combined
use of (12) and injectivity (30) one gets

    $(U \otimes X) \cdot q = (r \times s) \cdot (V \otimes Z)$    (48)

provided $U \cdot q = r \cdot V$ and $X \cdot q = s \cdot Z$ hold. Again instantiating $q := id$ and
$(\_ \otimes \_) = \langle \_, \_ \rangle$, this becomes absorption law

    $\langle r \cdot V, s \cdot Z \rangle = (r \times s) \cdot \langle V, Z \rangle$    (49)

Bird and de Moor [6] show, in (admittedly) a rather tricky way, that product absorption
holds for arbitrary relations. Our calculations have identified two restricted versions of
such a law - (46) and (49) - as “free” theorems, which could be deduced in a more
elegant, parametric way.



6 Other Transposes 

So far we have considered two instances of transposition, one applicable to any relation 
and the other restricted to simple relations. That entire relations will have their own 
instance of transposition is easy to guess: it will be a variant of the power-transpose 
imposing non-empty power objects (see exercise 4.45 in [6]). Dually, by (3) we will 
obtain a method for reasoning about surjective and injective relations. 

We conclude our study of relational transposition by relating it with a data repre- 
sentation technique known in computer science as hashing. This will require further 
restricting the class of the transposable relations to coreflexive relations. On the other 
hand, the transpose combinator will be enriched with an extra parameter called the 
“hash function”. 

7 The Hash Transpose 

Hashing. Hash tables are well known data structures [21, 12] whose purpose is to effi- 
ciently combine the advantages of both static and dynamic storage of data. Static struc- 
tures such as arrays provide random access to data but have the disadvantage of filling 
too much primary storage. Dynamic, pointer- based structures (eg. search lists, search 
trees etc.) are more versatile with respect to storage requirements but access to data is 
not as immediate. 

The idea of hashing is suggested by the informal meaning of the term itself: a large
database file is “hashed” into as many “pieces” as possible, each of which is randomly
accessed. Since each sub-database is smaller than the original, the time spent on accessing
data is shortened by some order of magnitude. Random access is normally achieved
by a so-called hash function, say $B \xleftarrow{h} A$, which computes, for each data item $a$
(of type A), its location $h\,a$ (of type B) in the hash table. Standard terminology regards
as synonyms all data competing for the same location. A set of synonyms is called a
bucket.

Data collision can be handled either by e.g. linear probing [21] or overflow handling
[12]. The former is not a totally correct representation of a data collection. Overflow
handling consists in partitioning a given data collection $S \subseteq A$ into n-many, disjoint
buckets, each one addressed by the relevant hash index computed by $h$⁸.

This partition can be modelled by a function $t$ of type $\mathcal{P}A \xleftarrow{t} B$ and the so-called
“hashing effect” is the following: the membership test $a \in S$ (which requires an
inspection of the whole dataset S) can be replaced by $a \in t(h\,a)$ (which only inspects
the bucket addressed by location $h\,a$). That is, equivalence

    $a \in S \equiv a \in t(h\,a)$    (50)

must hold for t to be regarded as a hash table.

Hashing as a transpose. First of all, we reason about equation (50):

    $a \in S \equiv a \in t(h\,a)$
≡   { introduce $b = h\,a$ }
    $a \in S \wedge b = h\,a \equiv a \in (t\,b)$



8 In fact, such buckets (“collision segments”) are but the equivalence classes of ker h restricted
to S (note that the kernel of a function is always an equivalence relation).
≡   { introduce $a = a'$ }
    $a \in S \wedge a = a' \wedge b = h\,a' \equiv a \in (t\,b)$
≡   { introduce S as a coreflexive ; converse of hash function }
    $a\,S\,a' \wedge a'\,h^\circ\,b \equiv a \in (t\,b)$
≡   { relational composition and rule (4) }
    $a (S \cdot h^\circ) b \equiv a (\in \cdot t) b$
≡   { going pointfree }
    $S \cdot h^\circ = \in \cdot t$
≡   { power transpose }
    $t = \Lambda (S \cdot h^\circ)$

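So, for an arbitrary coreflexive relation $A \xleftarrow{S} A$, its hash-transpose (for a fixed
hash function $B \xleftarrow{h} A$) is a function $\mathcal{P}A \xleftarrow{t} B$, satisfying

    $\in \cdot t = S \cdot h^\circ$

(cf. diagram: $S : A \leftarrow A$, $h^\circ : A \leftarrow B$, $t : \mathcal{P}A \leftarrow B$ and $\in : A \leftarrow \mathcal{P}A$).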

By defining

    $\Theta_h S = \Lambda (S \cdot h^\circ)$    (51)

we obtain an $h$-indexed family of hash transpose operators and associated universal properties

    $t = \Theta_h S \equiv \in \cdot t = S \cdot h^\circ$    (52)

and thus the cancellation law

    $\in \cdot (\Theta_h S) = S \cdot h^\circ$    (53)

etc.

In summary, the hash-transpose extends the power-transpose of coreflexive relations
in the sense that $\Lambda = \Theta_{id}$. That is, the power-transpose is the hash-transpose using id
as hash function. In practice, this is an extreme case, since some “lack of injectivity” is
required of h for the hash effect to take place. Note, in passing, that the other extreme
case is $h =\ !_A$, where $1 \xleftarrow{!_A} A$ denotes the unique function of its type: there is a
maximum loss of injectivity and all data become synonyms!
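The hash-transpose can be sketched in Haskell. The sketch assumes that a coreflexive S is represented by the set of elements it relates to themselves, with `Data.Set` from the `containers` package. The names `hashT` and `hashEffect` are hypothetical.

```haskell
import qualified Data.Set as Set

-- hashT h s plays the role of the hash-transpose of S for hash function h:
-- bucket b holds exactly the members of S whose hash index is b.
hashT :: (Ord a, Eq b) => (a -> b) -> Set.Set a -> b -> Set.Set a
hashT h s b = Set.filter (\a -> h a == b) s

-- The "hashing effect" (50): testing membership in S agrees with
-- testing membership in the single bucket addressed by h a.
hashEffect :: (Ord a, Eq b) => (a -> b) -> Set.Set a -> a -> Bool
hashEffect h s a = Set.member a s == Set.member a (hashT h s (h a))
```

With `h = id` every bucket holds at most one element, and with `h = const ()` the only bucket is S itself, matching the two extreme cases just mentioned.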
Hashing as a Galois connection. As powerset-valued functions, hash tables are ordered
by the lifting of the subset ordering $\mathcal{P}A \xleftarrow{\leq} \mathcal{P}A$ defined by $\leq\ =\ \in \setminus \in$, recall (32).
That the construction of hash tables is monotonic can be shown using the relational
calculus. First we expand $\dot{\leq}$:

    $t \mathrel{\dot{\leq}} t'$
≡   { pointwise ordering lifted to functions (8) }
    $t \subseteq\ \leq \cdot t'$
≡   { definition of the subset ordering (32) }
    $t \subseteq (\in \setminus \in) \cdot t'$
≡   { law $(R \setminus S) \cdot f = R \setminus (S \cdot f)$ [6], since $t'$ is a function }
    $t \subseteq\ \in \setminus (\in \cdot t')$
≡   { $(\in \cdot)$ is lower adjoint of $(\in \setminus)$, recall (11) }
    $\in \cdot t \subseteq\ \in \cdot t'$    (54)

Then we reason:

    $(\Theta_h) S \mathrel{\dot{\leq}} (\Theta_h) R$
≡   { by (54) }
    $\in \cdot (\Theta_h) S \subseteq\ \in \cdot (\Theta_h) R$
≡   { cancellation (53) }
    $S \cdot h^\circ \subseteq R \cdot h^\circ$
⇐   { $(\cdot h^\circ)$ is monotone, cf. lower-adjoints in (11) }
    $S \subseteq R$

So, the smallest hash-table is that associated with the empty relation $\bot$, that is $\Lambda \bot$,
which is the constant function $t = \emptyset$, and the largest one is $t = \Lambda h^\circ$, the hash-transpose of
$id_A$. In set-theoretic terms, this is A itself, the “largest” set of data of type A.

That the hash-transpose is not an isomorphism is intuitive: not every function t
mapping B to $\mathcal{P}A$ will be a hash-table, because it may fail to place data in the correct
bucket. Anyway, it is always possible to “filter” the wrongly placed synonyms from t
yielding the “largest” (correct) hash table $t'$ it contains,

    $t' = t \mathbin{\dot{\cap}} \Lambda (h^\circ)$

where, using vector notation [4], $f \mathbin{\dot{\cap}} g$ is the lifting of $\cap$ to powerset-valued functions,
$(f \mathbin{\dot{\cap}} g)\,b = (f\,b) \cap (g\,b)$ for all b. In order to recover all data from such filtered $t'$ we
evaluate

    $rng\ (\in \cdot t')$
where $rng\ R$ (read “range of R”) means $img\ R \cap id$. Altogether, we may define a function
on powerset valued functions $E_h t = rng\ (\in \cdot (t \mathbin{\dot{\cap}} \Lambda (h^\circ)))$ which extracts the
coreflexive relation associated with all data correctly placed in t. By reconverting $E_h t$
into a hash-table again one will get a table smaller than t:

    $\Theta_h (E_h t) \mathrel{\dot{\leq}} t$    (55)

(See proof in [18].) Another fact we can prove is a “perfect” cancellation on the other
side:

    $E_h (\Theta_h S) = S$    (56)

(See proof in [18].) These two cancellations, together with the monotonicity of the
hash transpose $\Theta_h$ and that of $E_h$ (this is monotone because it only involves monotonic
combinators) are enough, by Theorem 5.24 in [1], to establish perfect Galois
connection

    $\Theta_h S \mathrel{\dot{\leq}} t \ \equiv\ S \subseteq \underbrace{rng\ (\in \cdot (t \mathbin{\dot{\cap}} \Lambda (h^\circ)))}_{E_h t}$

(cf. diagram: $\Theta_h$ maps $\{S \mid S \subseteq id_A\}$ to $(\mathcal{P}A)^B$ and $E_h$ maps back). Being a lower adjoint, the hash-transpose
will distribute over union, $\Theta_h (R \cup S) = (\Theta_h R) \mathbin{\dot{\cup}} (\Theta_h S)$ (so hash-table
construction is compositional) and enjoy other properties known of Galois connections.

From (56) we infer that $\Theta_h$ (resp. $E_h$) is injective (resp. surjective) and so can
be regarded as a data representation (resp. abstraction) in the terminology of Fig. 1,
whereby typical “database” operations such as insert, find, and remove (specified on
top of the powerset algebra) can be implemented by calculation [16, 18].
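The two cancellations (55) and (56) can be spot-checked on finite data. The sketch assumes coreflexives are represented by sets, tables by functions into `Data.Set`, and the bucket indices by an explicit finite list `bs` that contains `h a` for every datum `a` under consideration. `Table`, `hashT`, `extract` and `leqT` are hypothetical names.

```haskell
import qualified Data.Set as Set

-- A candidate hash table: a powerset-valued function on bucket indices.
type Table a b = b -> Set.Set a

-- Hash-transpose, as in (51), on set-represented coreflexives.
hashT :: (Ord a, Eq b) => (a -> b) -> Set.Set a -> Table a b
hashT h s b = Set.filter (\a -> h a == b) s

-- Counterpart of E_h: keep only the entries sitting in the bucket their
-- hash index prescribes, then collect them over all indices in bs.
extract :: (Ord a, Eq b) => (a -> b) -> [b] -> Table a b -> Set.Set a
extract h bs t = Set.unions [Set.filter (\a -> h a == b) (t b) | b <- bs]

-- Pointwise subset ordering on tables, checked over the indices in bs.
leqT :: Ord a => [b] -> Table a b -> Table a b -> Bool
leqT bs t t' = all (\b -> t b `Set.isSubsetOf` t' b) bs
```

Under these assumptions, `extract h bs (hashT h s) == s` corresponds to (56) and `leqT bs (hashT h (extract h bs t)) t` to (55); both are checks on particular inputs, not proofs.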

8 Conclusions and Future Work 

Functional transposition is a technique for converting relations into functions aimed 
at developing the algebra of binary relations indirectly via the algebra of functions. A 
functional transpose of a binary relation of a particular class is an “F-resultric” function 
where F is a parametric datatype with membership. This paper attempts to develop a 
basis for a theory of generic transposition under the following slogan: generic transpose 
is the converse of membership post-composition. 

Instances of generic transpose provide universal properties which all relations of 
particular classes of relations satisfy. Two such instances are considered in this paper, 
one applicable to any relation and the other applicable only to simple relations. In either
case, genericity consists of reasoning about the transposed relations without using the
explicit definition of the transpose operator itself. 

Our illustration of the purpose of transposition takes advantage of the free theo- 
rem of a polymorphic function. We show how to derive laws of relational combinators 
as free theorems involving their transposes. Finally, we relate the topic of functional 
transposition with hashing as a foretaste of a generic treatment of this well-known data 
representation technique [18]. 

Concerning future work, there are several directions for improving the contents of 
this paper. We list some of our concerns below. 

Generic membership. Our use of this device, which has received some attention in 
the literature [6, 10, 11], is still superficial. We would like to organize the taxonomy 
of binary relations in terms of morphisms among the membership relations of their 
“characteristic” transposes. We would also like to assess the role of transposition in 
the context of coalgebraic process refinement [14], where structural membership and 
inclusion seem to play a prominent role. 

The monadic flavour. Transposed relations are “F-resultric” functions and can so be 
framed in a monadic structure wherever F is a monad. This is suggested in the study of 
the power-transpose in [6] but we haven’t yet checked the genericity of the proposed 
constructs. This concern is related to exploiting the adjoint situations studied in [9, 8] 
and, in general, those involving the Kleisli category of a monad [2].

Generic hashing. Our approach to hashing in this paper stems from [16]. “Fractal” 
types [17] were later introduced as an attempt to generalize the process of hash table 
construction, based on characterizing datatype invariants by sub-objects and pullbacks. 
In the current paper we could dispense with such machinery by using coreflexive re- 
lations instead. The extension of this technique to other transposes based on Galois 
connections is currently under research [18]. 
A Proof That All Simple Relations Are Maybe-Transposable

We want to prove the existence of function $\Gamma_{Id+1}$ which converts simple relations into
(Id + 1)-resultric functions and is such that $\Gamma_{Id+1} = (\in_{Id+1} \cdot)^\circ$, that is,

    $\in \cdot f = R \ \equiv\ f = \Gamma R$

omitting the Id + 1 subscripts for improved readability. Our proof is inspired by [9]:

    $f = \Gamma R$
≡   { introduce id }
    $id \cdot f = \Gamma R$
≡   { coproduct reflexion }
    $[i_1, i_2] \cdot f = \Gamma R$
≡   { uniqueness of $1 \xleftarrow{\,!\,} 1 = id$ }
    $[i_1, i_2 \cdot\, !] \cdot f = \Gamma R$
≡   { require “obvious” properties (57,58) below }
    $[\Gamma\, id, \Gamma \bot] \cdot f = \Gamma R$
≡   { see (63) below }
    $(\Gamma [id, \bot]) \cdot f = \Gamma R$
≡   { the required fusion law stems from (62) below }
    $\Gamma ([id, \bot] \cdot f) = \Gamma R$
≡   { $\Gamma$ is injective, see (60) below }
    $[id, \bot] \cdot f = R$
≡   { recall (24) }
    $\in \cdot f = R$

A number of facts were assumed above whose proof is on demand. Heading the list
are

    $\Gamma \bot = i_2 \cdot\, !$    (57)

    $\Gamma f = i_1 \cdot f$    (58)

which match our intuition about the introduction of “error” outputs: totally undefined
relation $\bot$ should be mapped to the “everywhere-Nothing” function $i_2 \cdot\, !$, while any
other simple relation R should “override” $i_2 \cdot\, !$ with the (non-Nothing) entries in $i_1 \cdot R$.
Clearly, entirety of R will maximize the overriding - thus property (58).

Arrow $B + 1 \xleftarrow{\Gamma R} A$ suggests its converse $A \xleftarrow{(\Gamma R)^\circ} B + 1$, expressed by

    $(\Gamma R)^\circ = [R^\circ, \cdots]$    (59)
which is consistent with (57) and (58) - it is easy to infer $(\Gamma \bot)^\circ = [\bot^\circ, !^\circ]$ and
$(\Gamma f)^\circ = [f^\circ, \bot]$ from (16) - and is enough to prove that $\Gamma$ has $\in\ = i_1^\circ$ as left-inverse,

    $\in \cdot\, \Gamma = id$    (60)

that is, that $\Gamma$ is injective. We reason, for all R:

    $i_1^\circ \cdot \Gamma R = R$
≡   { take converses }
    $(\Gamma R)^\circ \cdot i_1 = R^\circ$
≡   { assumption (59) }
    $[R^\circ, \cdots] \cdot i_1 = R^\circ$
≡   { coproduct cancellation }
    $R^\circ = R^\circ$

The remaining assumptions in the proof require us to complete the construction of
the transpose operator. Inspired by (57) and (58), we define

    $\Gamma_{Id+1} R \stackrel{\mathrm{def}}{=} (i_2 \cdot\, !) \dagger (i_1 \cdot R)$    (61)

where $R \dagger S$, the “relation override” operator⁹, is defined by $(R \cdot (id - \ker S)) \cup S$, or
simply by R † S = S<S\>R if we resort to relational conditionals [1]. This version of
the override operator is useful in proving the particular instance of fusion (29) required
in the proof: this stems from

    $(R \dagger S) \cdot f = (R \cdot f) \dagger (S \cdot f)$    (62)

itself a consequence of a fusion property of the relational conditional [1].

It can be checked that (61) validates all other previous assumptions, namely (57,58)
and (59). Because $R \dagger S$ preserves entirety on any argument and simplicity on both
(simultaneously), $\Gamma R$ will be a function provided R is simple.

The remaining assumption in the proof stems from equalities

    $[\Gamma\, id, \Gamma \bot] = \Gamma [id, \bot] = \Gamma (i_1^\circ) = img\ i_1 \cup img\ i_2 = id$    (63)

which arise from (61) and the fact that $i_1$ and $i_2$ are (dis)jointly surjective injections.



9 This extends the map override operator of VDM [7], 
Abstract. Theories of programming languages formalize pointers by 
formalizing the addresses, the heap and the stack of a computer storage. 
These are implementation concepts. The aim of this paper is a theory 
that formalizes pointers in terms of concepts from high-level program- 
ming languages. We begin with a graph theory, which formalizes the 
implementation concepts but avoids some common distinctions. From it, 
we calculate the theory of trace equivalences, which formalizes concepts 
of high-level programming languages. From that theory, we calculate 
definitions in terms of weakest (liberal) preconditions. We consider the 
assignment and the copy operation, which is introduced in the paper; the 
object creation (i.e. the new-statement) is a sequential composition of 
them. Those wlp/wp-definitions and the concept of trace equivalence are
the result of the paper. They are intended as a foundation for program 
design; in particular, for an object-oriented one. 



0 Introduction 

By pointer theory, I mean a mathematical theory that formalizes pointers. Di- 
verse pointer theories can be found in the literature (e.g. [1, 2, 11-13, 15-18]). In 
spite of their diversity, they have two characteristics in common. 

First, these pointer theories formalize the addresses, the heap and the stack 
of a computer storage. Apart from differences in the mathematical concepts, 
the formalization is as follows. The stack holds the directly accessible program 
variables: it maps them into values. Among the values are integers, booleans, 
etc. and addresses. The heap holds the program variables that can be accessed 
only through a sequence of pointers: it maps pairs of an address and such a 
program variable into values. As an example, consider singly-linked lists. For any 
list element, let program variable data contain its integer value, and let program 
variable next point to the next list element. Let x be a directly accessible program
variable that points to a singly-linked list with at least two elements. The access 
to the integer value of the second list element is described as follows: 

The stack maps x to an address m. At address m of the heap begins
the first list element. The heap maps m and next to an address n.
At address n of the heap begins the second list element. The heap
maps n and data to an integer. (0)

In this way, pointers are formalized by addresses of the heap. 



D. Kozen (Ed.): MPC 2004, LNCS 3125, pp. 357-380, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004
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But addresses, the heap and the stack are not concepts of a high-level pro- 
gramming language. They belong to its implementation. Instead of (0), we write 
in the programming language (for example, in an object-oriented one) 

x.next.data . (1) 

Description (1) is much simpler than (0). But when we use a pointer theory, we 
must think as in (0). 

In contrast, consider control structures, for example the while-loop. When 
we design a while-loop, we do not think about the labels and jumps by which 
it is implemented. Thanks to Dijkstra’s predicate-transformer semantics and to 
Hoare’s rule for the while-loop, we can think about it much more abstractly. 

Second, pointer theories make several distinctions. They distinguish pointer 
nil from the other pointers. Since it points to no object, they do not formalize 
nil by an address. This distinguishes it from all other pointers. 

Pointer nil must not be followed. When it is followed in an expression, pointer 
theories assign an undefined value to that expression. The undefined value is yet 
another value of pointers: it is not an address, nor is it equal to nil. 

Another distinction concerns values such as integers, booleans etc. Pointer 
theories distinguish them from pointers so that they must be dealt with sepa- 
rately. These three distinctions complicate our thinking about pointers. 

Pointer theories are a foundation for program design. In particular, they are a 
foundation for object-oriented programming. (Throughout this paper, ‘address’ 
can be read as ‘object identifier’ and ‘heap’ as ‘object environment’.) Simplicity 
of pointer theories is therefore a major concern. 

In this paper, the task is to devise a pointer theory that is free of the two 
characteristics discussed above: it must not formalize addresses, the heap and 
the stack, and it must avoid the three distinctions. It is intended as a foundation 
for program design. In contrast to the pointer theories in the literature, it still 
lacks predicates and theorems that are tailored to program design. 

We must ensure that the pointer theory for which we are heading formalizes 
the usual idea of pointers and pointer operations. To ensure this, we begin with 
a pointer theory that formalizes addresses, the heap and the stack. That theory 
specializes Hoare’s and He’s graph theory [9,10]. The specialization eliminates 
the above three distinctions. In our graph theory, we define assignment, object 
creation and an operation called copy operation. We introduce the copy operation 
because it is simpler than the object creation, and the object creation can readily 
be expressed by it and the assignment. Therefore we can concentrate on the copy 
operation instead of the object creation. 

Then we eliminate addresses, the heap and the stack from the graph theory. 
A theory results that we call the theory of trace equivalences. The assignment 
and the copy operation are redefined in the new theory. These new definitions 
are calculated from the graph-theoretic ones. 

From the new definitions of the assignment and the copy operation, we cal- 
culate definitions in terms of weakest liberal preconditions and weakest precon- 
ditions. They conclude the derivation of our pointer theory. 
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The result of the paper is the concept of trace equivalence and the defi- 
nition of the assignment and the copy operation in terms of weakest (liberal) 
preconditions. Thus, we have a pointer theory that uses only concepts of high- 
level programming languages. We intend this result as a foundation for program 
design; in particular, for the design of object-oriented programs. 

The following notions and notations are used in the paper. Function application
is denoted by an infix point (.). All functions are total. Classical substitution
is denoted by $\leftarrow$ so that the simultaneous substitution of a list E of expressions
for a list x of variables is denoted by $x \leftarrow E$. When applied to an expression,
substitution is written in prefix position. The operators have the following relative
binding powers that decrease from line to line:



$$\begin{array}{l}
= \quad \neq \\
\lor \quad \land \\
\Rightarrow \quad \Leftarrow \\
\equiv
\end{array}$$

Equality of pairs is defined as follows: for all $a0, a1, b0, b1$

$$(a0, a1) = (b0, b1) \;\equiv\; a0 = b0 \land a1 = b1 .$$

Crossed symbols, for example $\neq$, are defined as follows: for all $a, b$

$$a \neq b \;\equiv\; \neg (a = b) .$$

Set operations have the following binding powers: $*$ binds tighter than $\times$, which
binds tighter than $\to$, which binds tighter than $\in$.

We often introduce names with ‘let’. Unless stated otherwise, the scope ends 
with the surrounding section whose number contains one dot (e.g. 1.0). 

1 Graphs 

We begin with a pointer theory that formalizes addresses, the heap and the 
stack. 

1.0 Modelling Pointers 

Let a program state be given in terms of addresses, the heap and the stack. At 
some addresses, let objects be allocated, which consist of program variables. The 
program state is modelled by a graph in the following way. 

The addresses at which objects are allocated in the heap are nodes of the 
graph. Integers, booleans etc. are nodes, too. (We assume that the set of ad- 
dresses and the set of integers, booleans etc. are disjoint.) Let an object be 




360 Birgit Schieder 



allocated at address m; let that object have a program variable x whose value 
is n, and let n be an integer, boolean etc. or an address. That program variable 
is modelled by a directed edge from node m to node n with label x on the edge. 
This part of the graph can be depicted as follows: 

    m --x--> n

To model pointers that point to no object, we introduce a node null ; it 
models ‘no object’. Let the object that is allocated at address m have a program 
variable y, let y’s value be a pointer that points to no object. That program 
variable is modelled by a directed edge from node m to node null with label y 
on the edge. This part of the graph can be depicted as follows: 

    m --y--> null

In this way, we treat a pointer that points to no object like any other pointer. 

In a program, some program variables are directly accessible whereas others 
can be accessed only through a sequence of pointers. To indicate direct acces- 
sibility, we introduce a node root. Any directly accessible program variable z is 
modelled by a directed edge from node root with label z on the edge:

    root --z--> (some node)

Thus, node root suffices to distinguish the stack from the heap. In high-level pro- 
gramming languages, no pointer points to a directly accessible program variable. 
In terms of graphs, we can say: node root has no ingoing edges. In the definition 
of graph, this requirement will be formalized by healthiness condition (2) . 

Node root serves a second purpose. There is a directed edge from node root 
to each node that models an integer, boolean, etc.; the label of such an edge 
is the respective integer, boolean, etc. For example, there is an edge from node 
root to the node that models the integer 3 with label 3 on the edge: 

    root --3--> 3

In this way, we treat addresses, ‘no object’ and values such as integers, booleans, 
etc. alike. Therefore, we can leave types out of consideration. 

Similarly, there is a directed edge from node root to node null with label nil 
on the edge: 

    root --nil--> null

Thus, name nil models a pointer that points to no object. This requirement will 
be formalized by healthiness condition (3). 
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Since we deal with program variables (e.g. x, z ) and integers, booleans, etc. 
and nil in this uniform way, one set of labels will suffice. We will call it the 
alphabet. 

In any program state, any program variable has at most one value. This 
means, for example, that node m from above cannot have a second outgoing 
edge with label x. The edges of a graph can therefore be defined by a function e: 
there exists an edge from a node m to a node n with a label x iff e.(m, x) = n.
Thus, we avoid relations. 

Function e is total. This means that for all labels, any node has an outgoing 
edge with that label. In particular, node null has outgoing edges. When a pointer 
that points to no object is followed further, no object is reached. We therefore 
define the outgoing edges of null to lead back to null. This requirement will be 
formalized by healthiness condition (4). 

By defining e to be a total function, we avoid undefinedness. Node null plays 
a double role: it models ‘no object’, and it stands in for an undefined value. 



1.1 Definition of Graph 

Now we define our notion of graph. 

In the rest of this paper, let A be a set that contains the element nil, i.e.,

$$nil \in A .$$

We call A the alphabet. We call the elements of A labels. In the rest of this paper,
x, y and z stand for labels.

Let N be a set that contains the elements root and null, i.e.,

$$root \in N \quad\text{and}\quad null \in N .$$

Let e be a function with

$$e \in N \times A \to N .$$

Let for all $n \in N$ and for all x

$$e.(n,x) \neq root . \tag{2}$$

Let for all $n \in N$

$$e.(n,nil) = null . \tag{3}$$

Let for all x

$$e.(null,x) = null . \tag{4}$$

We call the pair (N, e) a graph. We call the elements of N the nodes of the
graph. We call (2), (3) and (4) healthiness conditions of the graph. From the
definition, it follows immediately that $root \neq null$. In the rest of this paper, l,
m and n stand for nodes.
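The definition can be made concrete with a small executable check. The following is only a sketch under stated assumptions: the alphabet is taken to be finite, the edge function e is represented as a Python dictionary keyed by (node, label), and the names `is_graph` and `total` are hypothetical helpers that do not occur in the paper.

```python
ROOT, NULL, NIL = "root", "null", "nil"

def is_graph(nodes, alphabet, e):
    """Check that (nodes, e) is a graph over `alphabet` in the sense of Sect. 1.1."""
    if ROOT not in nodes or NULL not in nodes or NIL not in alphabet:
        return False
    for n in nodes:
        for x in alphabet:
            if (n, x) not in e or e[(n, x)] not in nodes:
                return False              # e must be total from N x A into N
            if e[(n, x)] == ROOT:
                return False              # healthiness condition (2)
        if e[(n, NIL)] != NULL:
            return False                  # healthiness condition (3)
    return all(e[(NULL, x)] == NULL for x in alphabet)   # healthiness condition (4)

def total(nodes, alphabet, edges):
    """Assumed helper: complete a partial edge map by sending every other edge to null."""
    e = {(n, x): NULL for n in nodes for x in alphabet}
    e.update(edges)
    return e

# The two-element list of description (0): x points to m, whose next is n.
nodes = {ROOT, NULL, "m", "n"}
alphabet = {NIL, "x", "next"}
e = total(nodes, alphabet, {(ROOT, "x"): "m", ("m", "next"): "n"})
```

Here `is_graph(nodes, alphabet, e)` holds, whereas redirecting any edge to root, or giving null an outgoing edge to another node, makes the check fail.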
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1.2 Assignment 

Let (N, f) be a graph. Let $l : l \neq null$ and $m : m \neq root$ be nodes of the graph.
Let $y : y \neq nil$ be a label. We call

$$l \xrightarrow{y} := m$$

an assignment. We define it to have the following effect on graph (N, f). It
directs the edge from node l with label y to node m. All other edges and the
node-set remain unchanged. The restriction to l that differs from null reflects
the rule of programming languages that nil-pointers must not be followed. The
restriction to m that differs from root reflects the rule of programming languages
that no pointer points to the directly accessible program variables.

Formally, we define

$$(l \xrightarrow{y} := m).(N, f)$$

to be the pair (N, g) where g:

$$g \in N \times A \to N$$

is the following function: for all $n \in N$ and for all x

$$g.(l,y) = m \tag{5}$$

$$(n,x) \neq (l,y) \;\Rightarrow\; g.(n,x) = f.(n,x) . \tag{6}$$

It follows immediately that (N, g) is a graph.
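As an illustration of (5) and (6), the assignment on graphs can be sketched as a function on edge maps. The dictionary representation, the finite alphabet and the function name `assign` are assumptions made for this sketch only.

```python
ROOT, NULL, NIL = "root", "null", "nil"

def assign(f, l, y, m):
    """Effect of the assignment l -y-> := m on edge function f; the node-set is unchanged."""
    if l == NULL or m == ROOT or y == NIL:
        raise ValueError("requires l != null, m != root and y != nil")
    g = dict(f)          # (6): all other edges are unchanged
    g[(l, y)] = m        # (5): redirect the edge from l with label y
    return g

# Hypothetical state: root -x-> a, root -z-> b, all remaining edges lead to null.
nodes, alphabet = {ROOT, NULL, "a", "b"}, {NIL, "x", "y", "z"}
f = {(n, lab): NULL for n in nodes for lab in alphabet}
f[(ROOT, "x")], f[(ROOT, "z")] = "a", "b"

g = assign(f, "a", "y", "b")   # the graph-level analogue of x.y := z
```

The three side conditions are exactly what keeps the healthiness conditions intact: y differing from nil preserves (3), l differing from null preserves (4), and m differing from root preserves (2).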

1.3 Object Creation 

Let (N, f) be a graph. Let $l : l \neq null$ be a node of the graph. Let $y : y \neq nil$ be
a label. We call

$$new(l \xrightarrow{y})$$

an object creation. We define it to have the following effect on graph (N, f). It
adds a new node and directs the edge from node l with label y to the new node.
All edges from the new node are directed to node null.

Formally, we define the effect as follows. Let nn be a new node, i.e., let nn
be such that $nn \notin N$, and let nn be uniquely determined by graph (N, f). We
define

$$(new(l \xrightarrow{y})).(N, f)$$

to be the pair $(N \cup \{nn\}, g)$ where g:

$$g \in (N \cup \{nn\}) \times A \to (N \cup \{nn\})$$

is the following function: for all $n \in N$ and for all x

$$g.(l,y) = nn \tag{7}$$

$$g.(nn,x) = null \tag{8}$$

$$(n,x) \neq (l,y) \;\Rightarrow\; g.(n,x) = f.(n,x) . \tag{9}$$

It follows immediately that $(N \cup \{nn\}, g)$ is a graph.
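Object creation can be sketched in the same style. The paper only requires the new node nn to lie outside N and to be uniquely determined by the graph; the function `fresh` below is one hypothetical way of meeting that requirement, and the dictionary representation and finite alphabet are again assumptions.

```python
ROOT, NULL, NIL = "root", "null", "nil"

def fresh(nodes):
    """Assumed choice of the new node: a tagged counter derived from the node-set."""
    return ("node", len(nodes))

def new(nodes, f, l, y):
    """Effect of new(l -y->) on graph (nodes, f)."""
    if l == NULL or y == NIL:
        raise ValueError("requires l != null and y != nil")
    alphabet = {x for (_, x) in f}
    nn = fresh(nodes)
    assert nn not in nodes
    g = dict(f)                                    # (9): other edges are unchanged
    g.update({(nn, x): NULL for x in alphabet})    # (8): the new node points to null
    g[(l, y)] = nn                                 # (7): l -y-> now leads to the new node
    return nodes | {nn}, g

# Hypothetical state in which every edge leads to null.
nodes, alphabet = {ROOT, NULL}, {NIL, "x", "y"}
f = {(n, lab): NULL for n in nodes for lab in alphabet}
nodes2, g = new(nodes, f, ROOT, "x")
```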
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1.4 Copy Operation 

In Sect. 1.3, we defined an object creation. It directs all edges from the new 
node to node null. This part of its effect can be expressed by assignments. For 
example, the edge from the new node and with label x can be directed to node 
null by the assignment 

$$nn \xrightarrow{x} := null .$$

We split the object creation into a new operation, which we call copy operation, 
and assignments. It is simpler to study the copy operation in isolation because 
we study the assignment anyway. 

Let (N, f) be a graph. Let $l : l \neq null$ be a node of the graph. Let $y : y \neq nil$
be a label. We call

$$copy(l \xrightarrow{y})$$

a copy operation. We define it to have the following effect on graph (N, f). It
adds a new node and directs the edge from node l with label y to the new node.
For all labels x, an edge from the new node is added to the graph; this edge is
directed to the same node as the edge from node f.(l,y) with label x. In this
sense, the new node is a copy of node f.(l,y).

Formally, we define the copy operation as follows. Let nn be a new node, i.e.,
let nn be such that $nn \notin N$, and let nn be uniquely determined by graph (N, f).
We define

$$(copy(l \xrightarrow{y})).(N, f)$$

to be the pair $(N \cup \{nn\}, g)$ where g:

$$g \in (N \cup \{nn\}) \times A \to (N \cup \{nn\})$$

is the following function: for all $n \in N$ and for all x

$$g.(l,y) = nn \tag{10}$$

$$g.(nn,x) = f.(f.(l,y),x) \tag{11}$$

$$(n,x) \neq (l,y) \;\Rightarrow\; g.(n,x) = f.(n,x) . \tag{12}$$

It follows immediately that $(N \cup \{nn\}, g)$ is a graph.

2 Trace Equivalences

Addresses and object identifiers are not concepts of a high-level programming
language. Programs do not refer to objects by addresses or object identifiers but
by sequences of labels (cf. (1)). Therefore we will eliminate nodes from our graph
theory and bring in sequences of labels.

The nodes of a graph serve a single purpose. They tell us that two sequences
of labels lead from root to the same node or to different nodes. This is the only
information about nodes that we will keep.
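The remark that nodes only tell us whether two label sequences lead to the same node can be tried out on the copy operation of Sect. 1.4. The sketch below assumes a finite alphabet, a dictionary representation of the edge function, and a particular (hypothetical) choice of the new node; `follow` is an illustrative helper, not a name from the paper.

```python
ROOT, NULL, NIL = "root", "null", "nil"

def copy(nodes, f, l, y):
    """Effect of copy(l -y->): add a node nn that duplicates the outgoing edges of f.(l,y)."""
    if l == NULL or y == NIL:
        raise ValueError("requires l != null and y != nil")
    alphabet = {x for (_, x) in f}
    nn = ("node", len(nodes))                 # assumed way of choosing the new node
    assert nn not in nodes
    old = f[(l, y)]
    g = dict(f)                                           # (12)
    g.update({(nn, x): f[(old, x)] for x in alphabet})    # (11)
    g[(l, y)] = nn                                        # (10)
    return nodes | {nn}, g

def follow(e, labels):
    """Node reached from root by following the given sequence of labels."""
    node = ROOT
    for x in labels:
        node = e[(node, x)]
    return node

# Hypothetical state: x and y are aliases for object a, and a.w is object b.
nodes, alphabet = {ROOT, NULL, "a", "b"}, {NIL, "x", "y", "w"}
f = {(n, lab): NULL for n in nodes for lab in alphabet}
f[(ROOT, "x")] = f[(ROOT, "y")] = "a"
f[("a", "w")] = "b"

nodes2, g = copy(nodes, f, ROOT, "y")
```

Before the copy, the sequences x and y lead to the same node; afterwards they lead to different nodes, while x.w and y.w still lead to the same node b. Which node identifiers are involved is irrelevant to these observations.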
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2.0 Traces 

A trace is a finite sequence of labels, i.e., an element of set A* . In the rest of this 
paper, p, r, s, t, u and v stand for traces. 

The empty trace is denoted by $\varepsilon$. For all x, the singleton trace consisting of
x is denoted by $\langle x \rangle$. The append operation is denoted by the infix symbol $\triangleright$.
It has a higher binding power than = but a lower binding power than the infix
dot. Catenation of traces is denoted by the infix symbol $\mathbin{+\!\!+}$. It has the same
binding power as symbol $\triangleright$.



2.1 From Graphs to Trace Equivalences 

Let (N, e) be a graph. We introduce the raised star $*$ as a postfix operator. It
has a higher binding power than the infix dot. We define a function $e^{*}$ from a
trace to a node, i.e.,

$$e^{*} \in A^{*} \to N ,$$

as follows:

$$e^{*}.\varepsilon = root \tag{13}$$

and for all t, x

$$e^{*}.(t \triangleright x) = e.(e^{*}.t, x) . \tag{14}$$

Thus, in graph (N, e), any trace t leads from node root to node $e^{*}.t$. We say that
a node n is accessible in graph (N, e) iff there exists t such that $e^{*}.t = n$. Since
a program refers to nodes by traces, it can refer to accessible nodes only.
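Definitions (13) and (14) translate directly into a recursive function, and accessibility can be computed by a search from root. In this sketch traces are Python tuples of labels and the edge function is a dictionary over a finite alphabet; both representations, and the names `star` and `accessible`, are assumptions of the example.

```python
ROOT, NULL, NIL = "root", "null", "nil"

def star(e, t):
    """e*.t: the node reached from root along trace t (a tuple of labels)."""
    if not t:
        return ROOT                          # (13)
    return e[(star(e, t[:-1]), t[-1])]       # (14)

def accessible(e):
    """All nodes n with e*.t = n for some trace t, found by a worklist search."""
    alphabet = {x for (_, x) in e}
    seen, todo = {ROOT}, [ROOT]
    while todo:
        n = todo.pop()
        for x in alphabet:
            m = e[(n, x)]
            if m not in seen:
                seen.add(m)
                todo.append(m)
    return seen

# Hypothetical graph: root -x-> m -next-> n, plus a node c that no edge leads to.
nodes, alphabet = {ROOT, NULL, "m", "n", "c"}, {NIL, "x", "next"}
e = {(k, lab): NULL for k in nodes for lab in alphabet}
e[(ROOT, "x")], e[("m", "next")] = "m", "n"
```

In this graph the trace consisting of x followed by next leads to n, and node c is not accessible, so no program can refer to it.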

We define a binary relation $\sim$ on the set of traces, i.e., on $A^{*}$. Symbol $\sim$ has
the same binding power as symbol =. We define for all t, u

$$t \sim u \;\equiv\; e^{*}.t = e^{*}.u . \tag{15}$$

Theorem 0. $\sim$ is an equivalence relation.

Proof. Follows immediately from the definition of $\sim$ (15). □

Lemma 1. We have for all t, u, x

$$t \sim u \;\Rightarrow\; t \triangleright x \sim u \triangleright x .$$

Proof. Follows from the definition of $\sim$ (15) and the definition of $e^{*}$ (14). □

Lemma 2. We have

$$e^{*}.\langle nil \rangle = null .$$

Proof. Follows from the definition of $e^{*}$ (14) and healthiness condition (3). □

Lemma 3. We have for all t

$$t \triangleright nil \sim \langle nil \rangle .$$
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Proof. Follows from the definitions of ~ (15) and e* (14) and from healthiness 
condition (3) and Lemma 2. □ 

Lemma 4. We have for all x

$$\langle nil \rangle \triangleright x \sim \langle nil \rangle .$$

Proof. Follows from the definitions of $\sim$ (15) and $e^{*}$ (14) and from Lemma 2
and healthiness condition (4). □

The distinction between the stack and the heap boils down to the following
theorem.

Theorem 5. We have for all t

$$t \sim \varepsilon \;\equiv\; t = \varepsilon . \tag{16}$$

Proof. When $t = \varepsilon$, (16) follows immediately. When $t \neq \varepsilon$, it follows from
definition (13) and healthiness condition (2). □

Theorem 6. We have for all t, u, v

$$t \sim u \;\Rightarrow\; t \mathbin{+\!\!+} v \sim u \mathbin{+\!\!+} v . \tag{17}$$

Proof. Follows by structural induction on v and Lemma 1. □

Theorem 7. We have for all t, u

$$t \mathbin{+\!\!+} \langle nil \rangle \mathbin{+\!\!+} u \sim \langle nil \rangle . \tag{18}$$

Proof. The proof is by structural induction on u. The base case, i.e., (18) with
$u \leftarrow \varepsilon$, follows immediately from Lemma 3. The inductive case follows from
Lemma 1 and Lemma 4. □

We call a binary relation on A* a trace equivalence iff it is an equivalence relation 
and satisfies (16), (17) and (18). We call (16), (17) and (18) healthiness conditions 
of trace equivalences. 

Theorem. ~ is a trace equivalence. 

Proof. Follows immediately from Theorems 0, 5, 6 and 7. □ 

We say that graph (N, e) induces trace equivalence ~. In the pointer theory we 
are deriving now, program states are trace equivalences. 
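The healthiness conditions (16), (17) and (18) can be spot-checked mechanically on all traces up to a given length. The following is a bounded sanity check rather than a proof, and it does not test reflexivity, symmetry or transitivity; the finite alphabet, the tuple representation of traces and the function names are assumptions of this sketch.

```python
from itertools import product

ROOT, NULL, NIL = "root", "null", "nil"

def star(e, t):
    """e*.t, computed iteratively from (13) and (14)."""
    node = ROOT
    for x in t:
        node = e[(node, x)]
    return node

def satisfies_healthiness(equiv, alphabet, k):
    """Check (16), (17) and (18) for all traces of length at most k."""
    ts = [t for n in range(k + 1) for t in product(sorted(alphabet), repeat=n)]
    for t in ts:
        if equiv(t, ()) != (t == ()):
            return False                                      # (16)
        for u in ts:
            if not equiv(t + (NIL,) + u, (NIL,)):
                return False                                  # (18)
            if equiv(t, u) and not all(equiv(t + v, u + v) for v in ts):
                return False                                  # (17)
    return True

# Hypothetical graph: root -x-> m -next-> n, every other edge leads to null.
nodes, alphabet = {ROOT, NULL, "m", "n"}, {NIL, "x", "next"}
e = {(k, lab): NULL for k in nodes for lab in alphabet}
e[(ROOT, "x")], e[("m", "next")] = "m", "n"

def equiv(t, u):
    """The relation induced by graph (nodes, e), as in (15)."""
    return star(e, t) == star(e, u)
```

The induced relation passes the check, whereas plain equality of traces fails it because it violates (18).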

The following two lemmata will be used later. 

Lemma 8. We have for all t

$$e^{*}.t = root \;\equiv\; t = \varepsilon .$$

Proof. Follows immediately from the definition of $\_^{*}$ (13) and Theorem 5. □

Lemma 9. We have for all t

$$e^{*}.t = null \;\equiv\; t \sim \langle nil \rangle .$$

Proof. Follows immediately from Lemma 2. □
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2.2 From Trace Equivalences to Graphs 

Let $\sim$ be a trace equivalence. For all t, let $[t]$ denote t's equivalence class with
regard to $\sim$. Our task is to define a graph (N, e) that induces $\sim$, i.e., such that
for all t, u

$$[t] = [u] \;\equiv\; e^{*}.t = e^{*}.u .$$

To ensure this requirement, we define the graph such that for all t

$$e^{*}.t = [t] .$$

Since $e^{*}$ is a function from a trace to a node, we define node-set N to be the set
of equivalence classes. To ensure (13), we define root to be $[\varepsilon]$. To ensure (14),
we define function e as follows: for all t, x

$$e.([t],x) = [t \triangleright x] .$$

The right-hand side is independent of which element is chosen from $[t]$ because
of healthiness condition (17) of trace equivalences. We must define null so that
healthiness condition (3) of graphs holds. Because of healthiness condition (18)
of trace equivalences, we define null to be $[\langle nil \rangle]$. Healthiness condition (2)
of graphs follows from healthiness condition (16) of trace equivalences. Healthiness
condition (4) of graphs follows from healthiness condition (18) of trace
equivalences.

We have defined a graph (N, e), which induces $\sim$. Thus, we have proved that
every trace equivalence is induced by a graph.
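The construction can be imitated in code by letting one representative trace stand for each equivalence class. This is only a sketch under assumptions the paper does not make: the alphabet is finite, the relation is given as a decidable Python predicate, and there are finitely many classes (otherwise the loop below does not terminate). The names `graph_of` and `walk` are hypothetical.

```python
from itertools import product

NIL = "nil"

def graph_of(equiv, alphabet):
    """Build nodes and edges with e.([t], x) = [t |> x], one representative per class."""
    reps = [()]                       # the representative of root = [empty trace]
    e = {}

    def rep(t):
        for r in reps:
            if equiv(r, t):
                return r
        reps.append(t)
        return t

    i = 0
    while i < len(reps):              # reps may grow while we iterate
        for x in sorted(alphabet):
            e[(reps[i], x)] = rep(reps[i] + (x,))
        i += 1
    return set(reps), e

def walk(e, t):
    """e*.t in the constructed graph, whose root is the empty trace."""
    node = ()
    for x in t:
        node = e[(node, x)]
    return node

# Hypothetical state, induced by a tiny graph in which x and y are aliases.
alphabet = {NIL, "x", "y"}
f = {(n, lab): "null" for n in ("root", "null", "a") for lab in alphabet}
f[("root", "x")] = f[("root", "y")] = "a"

def equiv(t, u):
    def node(s):
        n = "root"
        for x in s:
            n = f[(n, x)]
        return n
    return node(t) == node(u)

nodes, e = graph_of(equiv, alphabet)
```

In the constructed graph, null is the representative reached from the empty trace via nil, and two traces reach the same representative exactly when they are equivalent.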

2.3 Assignment 

Now we define the assignment in terms of trace equivalences. We calculate the 
definition from the graph-theoretic one. 

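Let (N, f) be a graph. Let $l : l \neq null$ and $m : m \neq root$ be nodes of the
graph. Let l and m be accessible: let r and p be traces with

$$f^{*}.r = l \tag{19}$$

and

$$f^{*}.p = m . \tag{20}$$

Let $y : y \neq nil$ be a label. We consider the assignment

$$l \xrightarrow{y} := m .$$

Let (N, g) be the graph $(l \xrightarrow{y} := m).(N, f)$.

Our task is to define the assignment in terms of trace equivalences. Let graph
(N, f) induce the trace equivalence $\sim_f$, i.e., for all t, u

$$t \sim_f u \;\equiv\; f^{*}.t = f^{*}.u . \tag{21}$$

Let graph (N, g) induce the trace equivalence $\sim_g$, i.e., for all t, u

$$t \sim_g u \;\equiv\; g^{*}.t = g^{*}.u . \tag{22}$$

Our task is to define $\sim_g$ in terms of $\sim_f$. Hence, we have to eliminate $g^{*}$ from the
definition of $\sim_g$ (22) and to bring in $f^{*}$. For that purpose, we need a relation
between $g^{*}$ and $f^{*}$. We seek a function $\sigma$ (from a trace to a trace) such that
for all t

$$g^{*}.t = f^{*}.(\sigma.t) \tag{23}$$

because this is a simple relation between $g^{*}$ and $f^{*}$. From specification (23), we
calculate $\sigma$. Our calculation is by induction because the definition of $g^{*}$ is by
induction. In the base case, i.e., for $t \leftarrow \varepsilon$, we observe

$$\begin{array}{cl}
 & g^{*}.\varepsilon \\
= & \{\text{definition of } \_^{*} \text{ (13)}\} \\
 & root \\
= & \{\text{definition of } \_^{*} \text{ (13)}\} \\
 & f^{*}.\varepsilon \\
= & \{\text{definition of } \sigma \text{ (24), see below}\} \\
 & f^{*}.(\sigma.\varepsilon) .
\end{array}$$

We define

$$\sigma.\varepsilon = \varepsilon . \tag{24}$$

In the inductive case, i.e., for (23) with $t \leftarrow t \triangleright x$, we observe

$$\begin{array}{cl}
 & g^{*}.(t \triangleright x) \\
= & \{\text{definition of } \_^{*} \text{ (14)}\} \\
 & g.(g^{*}.t, x) \\
= & \{\text{inductive hypothesis (23)}\} \\
 & g.(f^{*}.(\sigma.t), x) .
\end{array} \tag{25}$$

For g's arguments in (25), we distinguish the two cases of g's definition (cf. (5)
and (6)). In the first case, i.e., when $(f^{*}.(\sigma.t), x) = (l, y)$, we observe

$$\begin{array}{cl}
 & \text{(25)} \\
= & \{\text{definition of } g \text{ (5)}\} \\
 & m \\
= & \{\text{specification of } p \text{ (20)}\} \\
 & f^{*}.p \\
= & \{\text{definition of } \sigma \text{ (26), see below}\} \\
 & f^{*}.(\sigma.(t \triangleright x)) .
\end{array}$$

For the premise of the case in question, we observe

$$\begin{array}{cl}
 & (f^{*}.(\sigma.t), x) = (l, y) \\
\equiv & \{\text{equality of pairs; specification of } r \text{ (19)}\} \\
 & f^{*}.(\sigma.t) = f^{*}.r \;\land\; x = y \\
\equiv & \{\text{definition of } \sim_f \text{ (21)}\} \\
 & \sigma.t \sim_f r \;\land\; x = y .
\end{array}$$

We define

$$\sigma.t \sim_f r \;\land\; x = y \;\Rightarrow\; \sigma.(t \triangleright x) = p . \tag{26}$$

In the second case, i.e., when $(f^{*}.(\sigma.t), x) \neq (l, y)$, we observe

$$\begin{array}{cl}
 & \text{(25)} \\
= & \{\text{definition of } g \text{ (6) with } n \leftarrow f^{*}.(\sigma.t)\} \\
 & f.(f^{*}.(\sigma.t), x) \\
= & \{\text{definition of } \_^{*} \text{ (14)}\} \\
 & f^{*}.(\sigma.t \triangleright x) \\
= & \{\text{definition of } \sigma \text{ (27), see below}\} \\
 & f^{*}.(\sigma.(t \triangleright x)) .
\end{array}$$

We define

$$\neg(\sigma.t \sim_f r \;\land\; x = y) \;\Rightarrow\; \sigma.(t \triangleright x) = \sigma.t \triangleright x . \tag{27}$$

Thus, we have defined $\sigma$ so that (23) holds. In particular, $\sigma$ is total.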
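Definitions (24), (26) and (27) describe a left-to-right rewriting of a trace, which can be sketched as a loop and checked against specification (23) on a small graph. The concrete graph, the finite alphabet, the bound on trace length and all function names are assumptions of this sketch.

```python
from itertools import product

ROOT, NULL, NIL = "root", "null", "nil"

def star(e, t):
    """e*.t, following (13) and (14)."""
    node = ROOT
    for x in t:
        node = e[(node, x)]
    return node

def sigma(equiv_f, r, y, p, t):
    """The trace sigma.t, computed prefix by prefix."""
    s = ()                                                  # (24)
    for x in t:
        s = p if (equiv_f(s, r) and x == y) else s + (x,)   # (26) and (27)
    return s

# Hypothetical state: root -x-> a, root -z-> b; we perform a -y-> := b.
nodes, alphabet = {ROOT, NULL, "a", "b"}, {NIL, "x", "y", "z"}
f = {(n, lab): NULL for n in nodes for lab in alphabet}
f[(ROOT, "x")], f[(ROOT, "z")] = "a", "b"
l, m, r, p = "a", "b", ("x",), ("z",)      # f*.r = l and f*.p = m, cf. (19), (20)
g = dict(f)
g[(l, "y")] = m                            # the graph (N, g) of Sect. 1.2

def equiv_f(t, u):
    return star(f, t) == star(f, u)        # (21)

traces = [t for n in range(4) for t in product(sorted(alphabet), repeat=n)]
holds_23 = all(star(g, t) == star(f, sigma(equiv_f, r, "y", p, t)) for t in traces)
```

On this example `holds_23` evaluates to true: following a trace in the new graph reaches the same node as following its rewritten form in the old graph.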

Now we return to our main task. We eliminate $g^{*}$ from the definition of $\sim_g$
(22) and bring in $f^{*}$. We observe for any t, u

$$\begin{array}{cl}
 & t \sim_g u \\
\equiv & \{\text{definition of } \sim_g \text{ (22)}\} \\
 & g^{*}.t = g^{*}.u \\
\equiv & \{\text{(23) and (23) with } t \leftarrow u\} \\
 & f^{*}.(\sigma.t) = f^{*}.(\sigma.u) \\
\equiv & \{\text{definition of } \sim_f \text{ (21) with } t, u \leftarrow \sigma.t, \sigma.u\} \\
 & \sigma.t \sim_f \sigma.u .
\end{array}$$

It remains to express the premises $l \neq null$ and $m \neq root$ in terms of traces
and $\sim_f$. We observe

$$\begin{array}{cl}
 & m \neq root \\
\equiv & \{\text{specification of } p \text{ (20)}\} \\
 & f^{*}.p \neq root \\
\equiv & \{\text{Lemma 8 with } e, t \leftarrow f, p\} \\
 & p \neq \varepsilon
\end{array}$$

and

$$\begin{array}{cl}
 & l \neq null \\
\equiv & \{\text{specification of } r \text{ (19)}\} \\
 & f^{*}.r \neq null \\
\equiv & \{\text{Lemma 9 with } e, t, \sim \;\leftarrow\; f, r, \sim_f\} \\
 & r \not\sim_f \langle nil \rangle .
\end{array} \tag{28}$$

Since a trace r that violates (28) cannot be excluded syntactically, we define the
assignment for any r. We define it to terminate iff it is started in a program
state that satisfies (28).

In the syntax of the assignment, i.e., in $l \xrightarrow{y} := m$, we replace nodes by traces.
Because of (19) and (20), we replace l by r and m by p; we write

$$r \triangleright y := p .$$

We restrict this assignment so that no nil-pointers are followed. For the left-hand
side, this is ensured by (28). Since the right-hand side is non-empty, there exist
s, z such that

$$p = s \triangleright z ;$$

we require $s \not\sim_f \langle nil \rangle$. (For the given assignment on graphs, this requirement
is satisfiable: the p in (20) can be chosen so that the requirement holds.)

We summarize the definition of the assignment in isolation from graphs. We
introduce all names afresh. Let r and s be traces, and let $y : y \neq nil$ and z be
labels. We call

$$r \triangleright y := s \triangleright z$$

an assignment. It is defined to terminate iff it is started in a program state $\sim$
such that

$$r \not\sim \langle nil \rangle \;\land\; s \not\sim \langle nil \rangle .$$

If the assignment terminates it does so in the program state $\sim_g$ that is defined
as follows: for all t, u

$$t \sim_g u \;\equiv\; \sigma.t \sim \sigma.u \tag{29}$$

where function $\sigma$ is defined as follows:

$$\sigma.\varepsilon = \varepsilon \tag{30}$$

and for all t, x

$$\sigma.t \sim r \;\land\; x = y \;\Rightarrow\; \sigma.(t \triangleright x) = s \triangleright z \tag{31}$$

and

$$\neg(\sigma.t \sim r \;\land\; x = y) \;\Rightarrow\; \sigma.(t \triangleright x) = \sigma.t \triangleright x . \tag{32}$$

In the case of termination, it remains to prove that the relation $\sim_g$ defined in
terms of $\sim$ and $\sigma$ is a trace equivalence. We know that every trace equivalence
is induced by a graph. Hence, $\sim$ is induced by a graph, say, (N, f). As every
graph does, $(f^{*}.r \xrightarrow{y} := f^{*}.(s \triangleright z)).(N, f)$ induces a trace equivalence. This trace
equivalence is relation $\sim_g$, as our calculations have proved.

We give an example of this definition of the assignment. 

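Example 0. Let x0 and x1 be two program variables. Let $r, s \leftarrow \langle x0 \rangle, \varepsilon$, i.e.,
we consider the assignment

$$\langle x0 \rangle \triangleright y := \langle z \rangle .$$

Let $\langle x0 \rangle \not\sim \langle nil \rangle$. When $\langle x0 \rangle \sim \langle x1 \rangle$, we observe for program state $\sim_g$, in
which the assignment terminates,

$$\begin{array}{cl}
 & \langle x0 \rangle \triangleright y \sim_g \langle x1 \rangle \triangleright y \\
\equiv & \{\text{(29)}\} \\
 & \sigma.(\langle x0 \rangle \triangleright y) \sim \sigma.(\langle x1 \rangle \triangleright y) \\
\equiv & \{\text{(33), see below}\} \\
 & true .
\end{array}$$

From (32) with $t, x \leftarrow \varepsilon, x1$, it follows that

$$\sigma.\langle x1 \rangle = \langle x1 \rangle .$$

When $\langle x0 \rangle \sim \langle x1 \rangle$, it follows from (31) with $t, x \leftarrow \langle x1 \rangle, y$ that

$$\sigma.(\langle x1 \rangle \triangleright y) = \langle z \rangle .$$

Similarly, it follows that

$$\sigma.(\langle x0 \rangle \triangleright y) = \langle z \rangle .$$

When $\langle x0 \rangle \sim \langle x1 \rangle$, we therefore have

$$\sigma.(\langle x0 \rangle \triangleright y) \sim \sigma.(\langle x1 \rangle \triangleright y) . \tag{33}$$

When $\langle x0 \rangle \not\sim \langle x1 \rangle$, it follows from (32) with $t, x \leftarrow \langle x1 \rangle, y$ that

$$\sigma.(\langle x1 \rangle \triangleright y) = \langle x1 \rangle \triangleright y .$$

When $\langle x0 \rangle \not\sim \langle x1 \rangle$, we therefore have

$$\sigma.(\langle x0 \rangle \triangleright y) \sim \sigma.(\langle x1 \rangle \triangleright y) \;\equiv\; \langle z \rangle \sim \langle x1 \rangle \triangleright y , \tag{34}$$

which we will use in a later example.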
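Example 0 can be replayed with a small executable model of definitions (29) to (32). In this sketch a program state is a Python predicate on pairs of traces, non-termination is modelled (as an assumption) by returning `None`, and `state_of` is a hypothetical helper that produces a trace equivalence from a finite set of edges so that the example has a concrete state to start from.

```python
ROOT, NULL, NIL = "root", "null", "nil"

def assign_state(equiv, r, y, s, z):
    """State after r |> y := s |> z started in state `equiv`; None if it does not terminate."""
    if y == NIL:
        raise ValueError("requires y != nil")
    if equiv(r, (NIL,)) or equiv(s, (NIL,)):
        return None                                   # termination condition violated

    def sigma(t):
        out = ()                                      # (30)
        for x in t:
            hit = equiv(out, r) and x == y
            out = s + (z,) if hit else out + (x,)     # (31) and (32)
        return out

    return lambda t, u: equiv(sigma(t), sigma(u))     # (29)

def state_of(edges):
    """Assumed helper: the trace equivalence induced by the given edges; others lead to null."""
    def node(t):
        n = ROOT
        for x in t:
            n = NULL if x == NIL else edges.get((n, x), NULL)
        return n
    return lambda t, u: node(t) == node(u)

aliased = state_of({(ROOT, "x0"): "a", (ROOT, "x1"): "a", (ROOT, "z"): "b"})
distinct = state_of({(ROOT, "x0"): "a", (ROOT, "x1"): "c", (ROOT, "z"): "b"})

after_aliased = assign_state(aliased, ("x0",), "y", (), "z")
after_distinct = assign_state(distinct, ("x0",), "y", (), "z")
```

When x0 and x1 are aliases, the traces x0.y and x1.y are equivalent after the assignment, as (33) states. When they are not aliases, the question reduces to whether z and x1.y were equivalent beforehand, as (34) states.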



2.4 Copy Operation 

Now we define the copy operation in terms of trace equivalences. As we did for 
the assignment, we calculate the definition from the graph-theoretic one. 

Let (N, f) be a graph. Let $l : l \neq null$ be a node of the graph. Let l be
accessible: let r be a trace with

$$f^{*}.r = l . \tag{35}$$

Let $y : y \neq nil$ be a label. We consider the copy operation

$$copy(l \xrightarrow{y}) .$$

Let $(N \cup \{nn\}, g)$ be the graph $(copy(l \xrightarrow{y})).(N, f)$.

We begin with a simple lemma that will not direct our calculations. It only
saves us the repetition of some lines of proof.

Lemma 10. We have

$$g^{*}.\varepsilon \neq nn \tag{36}$$

and for all t, x

$$g^{*}.(t \triangleright x) = nn \;\equiv\; (g^{*}.t, x) = (l, y) . \tag{37}$$

Proof. To prove (36), we observe

$$\begin{array}{cl}
 & g^{*}.\varepsilon \\
= & \{\text{definition of } \_^{*} \text{ (13)}\} \\
 & root \\
\neq & \{root \in N;\; nn \notin N\} \\
 & nn .
\end{array}$$

To prove (37), we observe for any t, x

$$\begin{array}{cl}
 & g^{*}.(t \triangleright x) = nn \\
\equiv & \{\text{definition of } \_^{*} \text{ (14)}\} \\
 & g.(g^{*}.t, x) = nn \\
\equiv & \{\text{definition of } g \text{ (10), (11), (12)}\} \\
 & (g^{*}.t, x) = (l, y) .
\end{array}$$

□

Our task is to define the copy operation in terms of trace equivalences. Let
graph (N, f) induce trace equivalence $\sim_f$, i.e., for all t, u

$$t \sim_f u \;\equiv\; f^{*}.t = f^{*}.u . \tag{38}$$

Let graph $(N \cup \{nn\}, g)$ induce trace equivalence $\sim_g$, i.e., for all t, u

$$t \sim_g u \;\equiv\; g^{*}.t = g^{*}.u . \tag{39}$$

Our task is to define $\sim_g$ in terms of $\sim_f$. Hence, we have to eliminate $g^{*}$ from the
definition of $\sim_g$ (39) and to bring in $f^{*}$. For that purpose, we need a relation
between $g^{*}$ and $f^{*}$. The simplest relation would be equality for all arguments,
i.e., for all t

$$g^{*}.t = f^{*}.t .$$

But this equality does not hold for all t: $g^{*}.t$ may equal nn, but $f^{*}.t$ may not
because nn is not a node of graph (N, f). Therefore we try the following weaker
formula: for all t

$$g^{*}.t \neq nn \;\Rightarrow\; g^{*}.t = f^{*}.t . \tag{40}$$

The proof of (40) is by structural induction on t because the definition of $g^{*}$ is
by induction. In the base case, i.e., for $t \leftarrow \varepsilon$, we observe for (40)'s consequent

$$\begin{array}{cl}
 & g^{*}.\varepsilon \\
= & \{\text{definition of } \_^{*} \text{ (13)}\} \\
 & root \\
= & \{\text{definition of } \_^{*} \text{ (13)}\} \\
 & f^{*}.\varepsilon .
\end{array}$$

In the inductive case of (40), i.e., for $t \leftarrow t \triangleright x$, we observe

$$\begin{array}{cl}
 & g^{*}.(t \triangleright x) \neq nn \;\Rightarrow\; g^{*}.(t \triangleright x) = f^{*}.(t \triangleright x) \\
\equiv & \{\text{Lemma 10 (37); definition of } \_^{*} \text{ (14), twice}\} \\
 & (g^{*}.t, x) \neq (l, y) \;\Rightarrow\; g.(g^{*}.t, x) = f.(f^{*}.t, x) .
\end{array} \tag{41}$$

In (41), g has the arguments $g^{*}.t$ and x. For them, we distinguish the three cases
of g's definition (cf. (10), (11) and (12)). In case (10), i.e., when $(g^{*}.t, x) = (l, y)$,
(41) holds trivially because its antecedent is equivalent to false. In case (11),
i.e., when $g^{*}.t = nn$, we observe for (41)'s consequent

$$\begin{array}{cl}
 & g.(g^{*}.t, x) = f.(f^{*}.t, x) \\
\equiv & \{\text{definition of } g \text{ (11)}\} \\
 & f.(f.(l,y), x) = f.(f^{*}.t, x) \\
\Leftarrow & \{\text{Leibniz's Rule}\} \\
 & f.(l,y) = f^{*}.t .
\end{array} \tag{42}$$

Since we cannot simplify (42), we exploit the premise $g^{*}.t = nn$. From Lemma 10,
it follows immediately that there exists u such that

$$t = u \triangleright y \;\land\; g^{*}.u = l . \tag{43}$$

To prove (42), we observe

$$\begin{array}{cl}
 & f.(l, y) \\
= & \{\text{second conjunct of (43)}\} \\
 & f.(g^{*}.u, y) \\
= & \{\text{by (43)'s second conjunct, } g^{*}.u \neq nn; \text{ by (43)'s first conjunct, } u \text{ is a prefix} \\
 & \;\text{of } t; \text{ inductive hypothesis (40) with } t \leftarrow u\} \\
 & f.(f^{*}.u, y) \\
= & \{\text{definition of } \_^{*} \text{ (14)}\} \\
 & f^{*}.(u \triangleright y) \\
= & \{\text{first conjunct of (43)}\} \\
 & f^{*}.t .
\end{array}$$

In case (12), i.e., when $g^{*}.t \in N \land (g^{*}.t, x) \neq (l, y)$, we observe for (41)'s
consequent

$$\begin{array}{cl}
 & g.(g^{*}.t, x) \\
= & \{\text{definition of } g \text{ (12) with } n \leftarrow g^{*}.t\} \\
 & f.(g^{*}.t, x) \\
= & \{\text{premise } g^{*}.t \in N; \text{ inductive hypothesis (40)}\} \\
 & f.(f^{*}.t, x) .
\end{array}$$

Hence, we have proved (40). The calculation following formula (43) proves an
additional result: we have for all t

$$g^{*}.t = nn \;\Rightarrow\; f.(l,y) = f^{*}.t .$$

From this result, it follows immediately that for all t, u

$$g^{*}.t = nn \;\land\; g^{*}.u = nn \;\Rightarrow\; f^{*}.t = f^{*}.u . \tag{44}$$

In (40) and (44), we have found a relation between $g^{*}$ and $f^{*}$.

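Now we return to our main task. We eliminate $g^{*}$ from the definition of $\sim_g$
(39) and bring in $f^{*}$. We observe for any t, u

$$\begin{array}{cl}
 & t \sim_g u \\
\equiv & \{\text{definition of } \sim_g \text{ (39)}\} \\
 & g^{*}.t = g^{*}.u \\
\equiv & \{\text{Law of the Excluded Middle}\} \\
 & g^{*}.t = g^{*}.u \land (g^{*}.t = nn \lor g^{*}.t \neq nn) \\
\equiv & \{\land \text{ distributes over } \lor\} \\
 & (g^{*}.t = g^{*}.u \land g^{*}.t = nn) \lor (g^{*}.t = g^{*}.u \land g^{*}.t \neq nn) \\
\equiv & \{= \text{ is transitive, twice}\} \\
 & (g^{*}.t = g^{*}.u \land g^{*}.t = nn \land g^{*}.u = nn) \lor (g^{*}.t = g^{*}.u \land g^{*}.t \neq nn \land g^{*}.u \neq nn) \\
\equiv & \{= \text{ is transitive}\} \\
 & (g^{*}.t = nn \land g^{*}.u = nn) \lor (g^{*}.t = g^{*}.u \land g^{*}.t \neq nn \land g^{*}.u \neq nn) \\
\equiv & \{\text{(40) and (40) with } t \leftarrow u\} \\
 & (g^{*}.t = nn \land g^{*}.u = nn) \lor (f^{*}.t = f^{*}.u \land g^{*}.t \neq nn \land g^{*}.u \neq nn) \\
\equiv & \{\text{(44)}\} \\
 & (f^{*}.t = f^{*}.u \land g^{*}.t = nn \land g^{*}.u = nn) \;\lor \\
 & (f^{*}.t = f^{*}.u \land g^{*}.t \neq nn \land g^{*}.u \neq nn) \\
\equiv & \{\land \text{ distributes over } \lor\} \\
 & f^{*}.t = f^{*}.u \land ((g^{*}.t = nn \land g^{*}.u = nn) \lor (g^{*}.t \neq nn \land g^{*}.u \neq nn)) \\
\equiv & \{\text{definition of } \sim_f \text{ (38); predicate calculus}\} \\
 & t \sim_f u \land (g^{*}.t = nn \;\equiv\; g^{*}.u = nn) .
\end{array}$$

It remains to eliminate $g^{*}$ and nn from the last line. We seek a predicate $\beta$
such that for all t

$$\beta.t \;\equiv\; g^{*}.t = nn . \tag{45}$$

We calculate $\beta$ from specification (45). Our calculation is by structural induction
on t because the definition of $g^{*}$ is by induction. In the base case, i.e., for (45)
with $t \leftarrow \varepsilon$, we observe

$$\begin{array}{cl}
 & g^{*}.\varepsilon = nn \\
\equiv & \{\text{Lemma 10 (36)}\} \\
 & false \\
\equiv & \{\text{definition of } \beta \text{ (46), see below}\} \\
 & \beta.\varepsilon .
\end{array}$$

We define

$$\neg\beta.\varepsilon . \tag{46}$$

In the inductive case, i.e., for (45) with $t \leftarrow t \triangleright x$, we observe

$$\begin{array}{cl}
 & g^{*}.(t \triangleright x) = nn \\
\equiv & \{\text{Lemma 10 (37)}\} \\
 & g^{*}.t = l \land x = y \\
\equiv & \{l \neq nn\} \\
 & g^{*}.t = l \land g^{*}.t \neq nn \land x = y \\
\equiv & \{\text{(40)}\} \\
 & f^{*}.t = l \land g^{*}.t \neq nn \land x = y \\
\equiv & \{\text{specification of } r \text{ (35)}\} \\
 & f^{*}.t = f^{*}.r \land g^{*}.t \neq nn \land x = y \\
\equiv & \{\text{definition of } \sim_f \text{ (38) with } u \leftarrow r; \text{ inductive hypothesis (45)}\} \\
 & t \sim_f r \land \neg\beta.t \land x = y \\
\equiv & \{\text{definition of } \beta \text{ (47), see below}\} \\
 & \beta.(t \triangleright x) .
\end{array}$$

We define for all t, x

$$\beta.(t \triangleright x) \;\equiv\; t \sim_f r \land x = y \land \neg\beta.t . \tag{47}$$

This definition of $\beta$ concludes our redefinition of $\sim_g$. In particular, $\beta$ is total.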
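Predicate $\beta$ can be computed prefix by prefix, and both its specification (45) and formula (40) can be spot-checked on a small graph. The sketch assumes a finite alphabet, a bound on the trace length, a fixed name for the new node, and a graph containing a self-loop, chosen so that the conjunct $\neg\beta.t$ in (47) actually matters.

```python
from itertools import product

ROOT, NULL, NIL = "root", "null", "nil"

def star(e, t):
    """e*.t, following (13) and (14)."""
    node = ROOT
    for x in t:
        node = e[(node, x)]
    return node

def beta(equiv_f, r, y, t):
    """beta.t, computed prefix by prefix."""
    b, prefix = False, ()                                # (46)
    for x in t:
        b = equiv_f(prefix, r) and x == y and not b      # (47)
        prefix += (x,)
    return b

# Hypothetical state: x and y point to a, and a -y-> a is a self-loop.
nodes, alphabet = {ROOT, NULL, "a"}, {NIL, "x", "y"}
f = {(n, lab): NULL for n in nodes for lab in alphabet}
f[(ROOT, "x")] = f[(ROOT, "y")] = "a"
f[("a", "y")] = "a"

l, y, r, nn = "a", "y", ("x",), "nn"                     # f*.r = l, cf. (35)
g = dict(f)
g.update({(nn, lab): f[(f[(l, y)], lab)] for lab in alphabet})   # (11)
g[(l, y)] = nn                                                   # (10)

def equiv_f(t, u):
    return star(f, t) == star(f, u)                      # (38)

traces = [t for n in range(5) for t in product(sorted(alphabet), repeat=n)]
holds_45 = all(beta(equiv_f, r, y, t) == (star(g, t) == nn) for t in traces)
holds_40 = all(star(g, t) == star(f, t) for t in traces if star(g, t) != nn)
```

Both checks succeed on this example. Along the trace x.y.y the copy is entered and left again, which is where omitting $\neg\beta.t$ would give the wrong answer.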

It remains to express the premise $l \neq null$ in terms of traces and $\sim_f$. We
observe

$$\begin{array}{cl}
 & l \neq null \\
\equiv & \{\text{specification of } r \text{ (35)}\} \\
 & f^{*}.r \neq null \\
\equiv & \{\text{Lemma 9 with } e, t, \sim \;\leftarrow\; f, r, \sim_f\} \\
 & r \not\sim_f \langle nil \rangle .
\end{array} \tag{48}$$

Since a trace r that violates (48) cannot be excluded syntactically, we define the
copy operation for any r. We define it to terminate iff it is started in a program
state that satisfies (48).

We summarize the definition of the copy operation in isolation from graphs.
We introduce all names afresh. Let r be a trace, and let $y : y \neq nil$ be a label.
We call

$$copy(r \triangleright y)$$

a copy operation. It is defined to terminate iff it is started in a program state $\sim$
such that

$$r \not\sim \langle nil \rangle .$$

If the copy operation terminates it does so in the program state $\sim_g$ that is
defined as follows: for all t, u

$$t \sim_g u \;\equiv\; t \sim u \;\land\; \beta.t = \beta.u \tag{49}$$

where predicate $\beta$ is defined as follows:

$$\neg\beta.\varepsilon \tag{50}$$

and for all t, x

$$\beta.(t \triangleright x) \;\equiv\; t \sim r \land x = y \land \neg\beta.t . \tag{51}$$

In the case of termination, it remains to prove that the relation $\sim_g$ defined in
terms of $\sim$ and $\beta$ is a trace equivalence. We know that every trace equivalence is
induced by a graph. Hence, $\sim$ is induced by a graph, say, (N, f). As every graph
does, $(copy(f^{*}.r \xrightarrow{y})).(N, f)$ induces a trace equivalence. This trace equivalence
is relation $\sim_g$, as our calculations have proved.
We give an example of this definition of the copy operation.

Example 1. Let $x$ be an additional program variable. Let $r \leftarrow \varepsilon$, i.e., we consider
the copy operation

$copy(\langle y \rangle)$ .

For program state $\sim_g$, in which the copy operation terminates, we observe

$\langle x \rangle \sim_g \langle y \rangle$

$=$ {(49)}

$\langle x \rangle \sim \langle y \rangle \land \beta.\langle x \rangle = \beta.\langle y \rangle$

$=$ {(52), see below}

$false$ .

We observe

$\beta.\langle x \rangle$

$=$ {(51) with $t \leftarrow \varepsilon$}

$\varepsilon \sim \varepsilon \land x = y \land \neg\beta.\varepsilon$

$=$ {$x \neq y$}

$false$ ;

similarly, it follows that $\beta.\langle y \rangle$. Hence, we have

$\beta.\langle x \rangle \neq \beta.\langle y \rangle$ , (52)

which concludes the example.
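The definitions (49) to (51) can be exercised on a small model. The following is only a sketch under stated assumptions: a trace is modelled as a tuple of labels, and the initial state is a hypothetical graph-induced equivalence (`graph_equiv`, with node names `root`, `n` and `null` invented for illustration). The termination condition (48) is not checked; it holds for the toy state used here.

```python
from typing import Callable, Dict, Tuple

Trace = Tuple[str, ...]              # a trace is a finite sequence of labels
Equiv = Callable[[Trace, Trace], bool]

def graph_equiv(edges: Dict[Tuple[str, str], str]) -> Equiv:
    """Toy trace equivalence: two traces are equivalent iff they reach the same node."""
    def reach(t: Trace) -> str:
        node = "root"
        for label in t:
            node = edges.get((node, label), "null")  # missing edges lead to null
        return node
    return lambda t, u: reach(t) == reach(u)

def copy_state(equiv: Equiv, r: Trace, y: str) -> Equiv:
    """State after copy(r |> y), following (49) to (51)."""
    def beta(t: Trace) -> bool:
        if not t:                    # (50): beta does not hold for the empty trace
            return False
        init, x = t[:-1], t[-1]
        return equiv(init, r) and x == y and not beta(init)   # (51)
    return lambda t, u: equiv(t, u) and beta(t) == beta(u)    # (49)

# Example 1: r is the empty trace, so the operation is copy(<y>).
before = graph_equiv({("root", "x"): "n", ("root", "y"): "n"})
after = copy_state(before, (), "y")
print(before(("x",), ("y",)), after(("x",), ("y",)))
```

In this toy state `<x>` and `<y>` are equivalent before the copy, and the copy separates them, in line with Example 1.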

3 Weakest (Liberal) Preconditions 

The theory of trace equivalences contains only concepts of high-level program- 
ming languages: traces and trace equivalences. But the assignment and the copy 
operation are still defined in a style that has proved unwieldy for program design: 
they are defined as functions that map the program state in which an operation 
is started to the program state in which it terminates (if it terminates). 

Now we define the assignment and the copy operation in terms of weakest 
liberal preconditions and weakest preconditions [7,8]. We derive the definitions 
from those given in terms of trace equivalences. 

In the rest of Sect. 3, let ~ be a variable that stands for the program state. 
We denote the ‘everywhere’ operator by a pair of square brackets. It quantifies
universally over ~, i.e., over all program states. 

3.0 Deterministic Statements 

In the following, we write quotation marks “ ” to indicate a syntactic argument. Let “S” be a
deterministic statement, i.e., started in any program state, “S” has exactly one
computation. By definition of $wp$, the computation terminates iff “S” is started
in a program state in which

$wp.\text{“S”}.true$ (53)

holds. We define a function $S$ (without the quotation marks) as follows: it maps a program
state that satisfies (53) and in which “S” is started to the program state in which
the computation of “S” terminates.

Let $P$ be a predicate that depends on variable $\sim$. When $\sim$ follows an opening
bracket immediately or precedes a closing bracket immediately, then it is treated
like an alphanumeric name. Since $(\sim \leftarrow S.\sim)$ substitutes $S.\sim$ for variable $\sim$,
we have

$[wlp.\text{“S”}.P \equiv wp.\text{“S”}.true \Rightarrow (\sim \leftarrow S.\sim).P]$ (54)

and

$[wp.\text{“S”}.P \equiv wp.\text{“S”}.true \land (\sim \leftarrow S.\sim).P]$ . (55)
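Formulas (54) and (55) can be read as recipes for computing $wlp$ and $wp$ of a deterministic statement from its termination condition and its state function. The sketch below assumes a toy state space (a single integer) and a hypothetical statement `x := 10 // x`; neither is taken from the paper.

```python
from typing import Callable

State = int                                   # toy program state: the value of one variable
Pred = Callable[[State], bool]

def wp(terminates: Pred, step: Callable[[State], State], post: Pred) -> Pred:
    """(55): wp.S.P holds iff S terminates and P holds in the final state."""
    return lambda s: terminates(s) and post(step(s))

def wlp(terminates: Pred, step: Callable[[State], State], post: Pred) -> Pred:
    """(54): wlp.S.P holds iff termination implies P in the final state."""
    return lambda s: (not terminates(s)) or post(step(s))

# Hypothetical statement "x := 10 // x": it terminates iff x != 0.
terminates = lambda x: x != 0
step = lambda x: 10 // x
positive = lambda x: x > 0
```

Here `terminates` plays the role of $wp.\text{“S”}.true$ and `step` the role of the function $S$; in the state `0` the weakest precondition is false while the weakest liberal precondition is true.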



For the sake of simplicity, we abbreviate $(\sim \leftarrow S.\sim)$ by $S$. Precisely, we specify
a predicate transformer $S$ by

$[wp.\text{“S”}.true \Rightarrow (S.P \equiv (\sim \leftarrow S.\sim).P)]$ . (56)

From (56) it follows immediately that in all program states with (53), predicate
transformer $S$ distributes over the boolean operators ($\lor$, $\land$, $\equiv$, $\not\equiv$, $\Rightarrow$, $\Leftarrow$, $\neg$)
and over the logical quantifiers ($\forall$ and $\exists$). Values such as integers, booleans,
labels, traces, etc. are independent of the program state $\sim$. It therefore follows
immediately from (56) with $P \leftarrow E = F$ that for all values $E$, $F$

$[wp.\text{“S”}.true \Rightarrow (S.(E = F) \equiv E = F)]$ .

Traces are independent of the program state $\sim$. It therefore follows immediately
from (56) with $P \leftarrow t \sim u$ that for all $t$, $u$

$[wp.\text{“S”}.true \Rightarrow (S.(t \sim u) \equiv (t,u) \in S.\sim)]$ . (57)

In (57), we can simplify the right-hand side of the equivalence when “S” is an 
assignment or a copy operation. In the simplification, we will now exploit their 
definitions that were given in terms of trace equivalences. 

3.1 Assignment 

Let 

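$r \triangleright y := s \triangleright z$

be an assignment. From the definition of its termination in Sect. 2.3, it follows
immediately that

$[wp.\text{“}r \triangleright y := s \triangleright z\text{”}.true \equiv r \nsim \langle nil \rangle \land s \nsim \langle nil \rangle]$ . (58)

Let function $a$ be defined by (30), (31) and (32). When $r \nsim \langle nil \rangle \land s \nsim \langle nil \rangle$,
we observe for any $t$, $u$

$(r \triangleright y := s \triangleright z).(t \sim u)$

$=$ {(57) with $S \leftarrow (r \triangleright y := s \triangleright z)$}

$(t,u) \in (r \triangleright y := s \triangleright z).(\sim)$

$=$ {(29) with <— $(r \triangleright y := s \triangleright z).(\sim)$}

$a.t \sim a.u$ .

Hence, we have for all $t$, $u$

$[r \nsim \langle nil \rangle \land s \nsim \langle nil \rangle \Rightarrow$

$((r \triangleright y := s \triangleright z).(t \sim u) \equiv a.t \sim a.u)]$ . (59)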

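This formula concludes the definition of $wlp$ and $wp$ for the assignment.

Example. We continue Example 0. We consider the assignment

$\langle x0 \rangle \triangleright y := \langle z \rangle$ .

We calculate the weakest liberal precondition of predicate $\langle x0 \rangle \triangleright y \sim \langle x1 \rangle \triangleright y$.
We observe

$wlp.\text{“}\langle x0 \rangle \triangleright y := \langle z \rangle\text{”}.(\langle x0 \rangle \triangleright y \sim \langle x1 \rangle \triangleright y)$

$=$ {(54)}

$wp.\text{“}\langle x0 \rangle \triangleright y := \langle z \rangle\text{”}.true \Rightarrow (\langle x0 \rangle \triangleright y := \langle z \rangle).(\langle x0 \rangle \triangleright y \sim \langle x1 \rangle \triangleright y)$

$=$ {(58) and (59), both with $r, s \leftarrow \langle x0 \rangle, \varepsilon$}

$\langle x0 \rangle \nsim \langle nil \rangle \land \varepsilon \nsim \langle nil \rangle \Rightarrow a.(\langle x0 \rangle \triangleright y) \sim a.(\langle x1 \rangle \triangleright y)$

$=$ {healthiness condition (16); Example 0: (33)}

$\langle x0 \rangle \nsim \langle nil \rangle \Rightarrow (\langle x0 \rangle \nsim \langle x1 \rangle \Rightarrow a.(\langle x0 \rangle \triangleright y) \sim a.(\langle x1 \rangle \triangleright y))$

$=$ {Example 0: (34)}

$\langle x0 \rangle \nsim \langle nil \rangle \Rightarrow (\langle x0 \rangle \nsim \langle x1 \rangle \Rightarrow \langle z \rangle \sim \langle x1 \rangle \triangleright y)$

$=$ {predicate calculus}

$\langle x0 \rangle \sim \langle nil \rangle \lor \langle x0 \rangle \sim \langle x1 \rangle \lor \langle z \rangle \sim \langle x1 \rangle \triangleright y$ .

This concludes the example.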



3.2 Copy Operation 

Let 

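$copy(r \triangleright y)$

be a copy operation. From the definition of its termination in Sect. 2.4, it follows
immediately that

$[wp.\text{“}copy(r \triangleright y)\text{”}.true \equiv r \nsim \langle nil \rangle]$ . (60)

Let predicate $\beta$ be defined by (50) and (51). When $r \nsim \langle nil \rangle$, we observe for
any $t$, $u$

$(copy(r \triangleright y)).(t \sim u)$

$=$ {(57) with $S \leftarrow copy(r \triangleright y)$}

$(t,u) \in (copy(r \triangleright y)).(\sim)$

$=$ {(49) with $\sim_g \leftarrow (copy(r \triangleright y)).(\sim)$}

$t \sim u \land \beta.t = \beta.u$ .

Hence, we have for all $t$, $u$

$[r \nsim \langle nil \rangle \Rightarrow ((copy(r \triangleright y)).(t \sim u) \equiv t \sim u \land \beta.t = \beta.u)]$ . (61)

This formula concludes the definition of $wlp$ and $wp$ for the copy operation.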
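Example. We continue Example 1. We consider the copy operation

$copy(\langle y \rangle)$ .

We calculate the weakest precondition of predicate $\langle x \rangle \nsim \langle y \rangle$. We observe

$wp.\text{“}copy(\langle y \rangle)\text{”}.(\langle x \rangle \nsim \langle y \rangle)$

$=$ {(55)}

$wp.\text{“}copy(\langle y \rangle)\text{”}.true \land (copy(\langle y \rangle)).(\langle x \rangle \nsim \langle y \rangle)$

$=$ {(60) with $r \leftarrow \varepsilon$; $copy(\langle y \rangle)$ distributes over $\neg$}

$\varepsilon \nsim \langle nil \rangle \land \neg (copy(\langle y \rangle)).(\langle x \rangle \sim \langle y \rangle)$

$=$ {healthiness condition (16); (61) with $r \leftarrow \varepsilon$}

$\neg (\langle x \rangle \sim \langle y \rangle \land \beta.\langle x \rangle = \beta.\langle y \rangle)$

$=$ {Example 1: (52)}

$true$ .

Hence, we have

$[wp.\text{“}copy(\langle y \rangle)\text{”}.(\langle x \rangle \nsim \langle y \rangle)]$ ,

which concludes the example.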

4 Conclusion 

The result of this paper is the concept of trace equivalence and the definition of 
the assignment and the copy operation in terms of weakest (liberal) precondi- 
tions. Instead of implementation concepts, this result uses concepts of high-level 
programming languages. 

We intend the result as a foundation for program design. One hope is that 
program specification may become simpler and clearer when we do not have to 
think in terms of addresses and the heap. As an example, consider the specification
of singly-linked lists. We define a predicate $SL$ on a trace of integers and
a trace. The infix symbol $\triangleleft$ denotes the prepend operation and has the same
binding power as $\triangleright$. For example, we want

$SL.(0 \triangleleft 1 \triangleleft \langle 2 \rangle).t$

to denote that $0 \triangleleft 1 \triangleleft \langle 2 \rangle$ is represented by singly-linked list $t$. As in the introduction
(Sect. 0), program variable $data$ contains the integer value of a list element,
and program variable $next$ points to the next list element. The definition of $SL$
is by structural induction on the trace of integers. We define for all $t$

$[SL.\varepsilon.t \equiv t \sim \langle nil \rangle]$ ;

we define for all integers $d$, traces $\sigma$ of integers and $t$

$[SL.(d \triangleleft \sigma).t \equiv t \triangleright data \sim \langle d \rangle \land SL.\sigma.(t \triangleright next)]$ .
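The inductive definition of $SL$ can be checked against a concrete state. The sketch below is an illustration only: it induces the trace equivalence from a small hypothetical graph (`EDGES`, with invented node names), which is exactly the kind of model the theory lets a program designer avoid, and it assumes that the one-element trace `<d>` denotes the integer `d`.

```python
from typing import Dict, Tuple, Union

Label = Union[str, int]
Trace = Tuple[Label, ...]

# Toy state: x points to a three-element list holding 0, 1, 2.
EDGES: Dict[Tuple[object, Label], object] = {
    ("root", "x"): "c0",
    ("c0", "data"): ("int", 0), ("c0", "next"): "c1",
    ("c1", "data"): ("int", 1), ("c1", "next"): "c2",
    ("c2", "data"): ("int", 2),          # c2.next is missing, i.e. nil
}

def reach(t: Trace) -> object:
    node: object = "root"
    for label in t:
        if node == "root" and isinstance(label, int):
            node = ("int", label)        # assumption: the trace <d> denotes the integer d
        else:
            node = EDGES.get((node, label), "null")
    return node

def equiv(t: Trace, u: Trace) -> bool:
    return reach(t) == reach(u)

def SL(ints: Tuple[int, ...], t: Trace) -> bool:
    if not ints:                         # SL.eps.t  ==  t ~ <nil>
        return equiv(t, ("nil",))
    d, rest = ints[0], ints[1:]          # ints = d <| rest
    return equiv(t + ("data",), (d,)) and SL(rest, t + ("next",))
```

With this state, `SL((0, 1, 2), ("x",))` holds, while a shorter or different integer trace is rejected.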

Another hope is that program derivation may be free of operational arguments. 
The classical explanation of object creation is quite operational. It says that 
object creation allocates space on the heap, perhaps assigns values to program
variables, returns the initial address of the space and assigns the address to a 
program variable, and it says that distinct executions of an object creation allo- 
cate distinct spaces and return different addresses. Many programming theories 
formalize this explanation directly. It is reflected even in Hoare-rules, as e.g. 
in [1]: ‘The rule for object construction has a complicated transition relation, 
but this transition relation directly reflects the operational semantics.’ Although 
other rules are less operational (e.g. [2-4,6]), they are still rather complicated. 

To be useful in program design, the presented pointer theory needs to be 
extended. Useful predicates on traces have to be defined. Hoare-rules that are 
tailored to practical needs have to be derived. In particular, Hoare-rules are 
needed for the object creation, which so far has been expressed as a sequential 
composition of the copy operation and assignments. 

The pointer theory is intended especially as a foundation for object-oriented 
programming. When object-oriented concepts are added, their formalizations can 
be adopted from existing theories. Of particular interest are theories that leave 
pointers out of consideration [5]. Especially in object-oriented programming, 
implementation concepts still determine our thinking. The notations of object- 
oriented programming, however, allow it to be more abstract. 
Abstract. This paper describes a refinement-based development method
for mobile processes. Process mobility is interpreted as the assignment or 
communication of higher-order variables, whose values are process con- 
stants or parameterised processes, in which target variables update their 
values and source variables lose their values. The mathematical basis for 
the work is Hoare and He’s Unifying Theories of Programming (UTP). 
In this paper, we present a set of algebraic laws to be used for the devel- 
opment of mobile systems. The correctness of these laws is ensured by 
the UTP semantics of mobile processes. We illustrate our theory through 
a simple example that can be implemented in both a centralised and a 
distributed way. First, we present the π-calculus specification for both
systems and demonstrate that they are observationally equivalent. Next,
we show how the centralised system may be developed step-wise into
the distributed one using our proposed laws.

Keywords: Mobile processes, refinement, UTP, higher-order program- 
ming. 



1 Introduction 

Mobile processes are becoming increasingly popular in software applications; 
they can roam from host to host in a network, carrying their own state and 
program code. A common example is a Java applet that is downloaded from a 
server across the Internet, and then executed in a client. Such processes bring 
new programming problems, and so require their own theoretical foundations 
for the correct and rigorous development of applications. 

The most well-known formal theory for mobile processes is the π-calculus [7,
8, 13], where processes may exchange channel names, and thus change their
interconnections dynamically. The higher-order π-calculus (HOπ for short) [12] treats
mobility by exchanging processes themselves. The π-calculus and the HOπ are
usually given an operational semantics, and then provided with an equivalence
relation for comparing processes using bisimulation. Our interest lies in
combining programming theories; Circus [15] combines theories of sequential and
concurrent programming. Unfortunately, it is not easy to combine theories that
are based on operational semantics, because of the fragility of results based on
induction [5]. We are interested in refinement calculi, which arise more naturally
from denotational semantics. Finally, we are interested in the phenomenon of
divergence, which is ignored in the π-calculus and its variants.
We propose a theory of mobile processes that is suitable for refinement, so
that it can support step-wise development of mobile systems. In particular, starting
with an abstract specification, we develop a distributed system by using mobile
processes, in which each step of the derivation is provable in the theory.

In this paper, we present our initial results, a set of algebraic laws. The math- 
ematical basis for the work is Hoare and He’s Unifying Theories of Programming 
(UTP) [5], which uses a simple alphabetised form of Tarski’s relational calcu- 
lus. The correctness of these laws is proved using the UTP semantics for mobile 
processes. We study a simple example in both the π-calculus and our theory. The
results show the suitability of our laws for the step-wise development of a mobile 
system from a centralised specification. 

Process mobility is exhibited in the higher-order variable assignment or com- 
munication, in which both the source and the target are variables, and have 
process behaviours as their values. When a higher-order mobile-variable assignment
or communication takes place, the target variable is updated, more or
less as one would expect; but at the same time, the source variable becomes 
undefined. In this way, processes are moved around in the system. 

The presence of process-valued variables has an impact on monotonicity with
respect to refinement, and we require that process assignment should be
monotonic in the assigned value. This follows the treatment of higher-order
programming in [5].

The remainder of this paper is organised as follows. The next section gives 
a simple example implemented in both a centralised and a distributed way, 
and shows that they are observationally equivalent in the π-calculus. Section 3
presents the syntax of mobile processes and a brief overview of its UTP seman-
tics. In Section 4, we illustrate our development method for distributed systems 
through a set of laws; thereafter, we apply the laws in the example in Section 5. 
We conclude the presented work in Section 6 and outline some future work. 



2 A Simple Example in the π-Calculus

Suppose that there is a data centre that needs to analyse data based on informa- 
tion residing in different hosts on a network. For simplicity, we assume that this 
analysis amounts to getting the sum of the data in each host. This small appli- 
cation can be implemented in two ways. In the first implementation (Figure 1), 
the data centre directly communicates with each host, one by one, getting the
local information and then updating its running total; all calculations are carried 
out in the data centre. This implementation is very simple, and so is obviously 
correct. 

In the second implementation (Figure 2), similar pieces of specification are 
abstracted and encapsulated in a parameterised process variable, which travels 
from the data centre and roams the network to each host, taking its own state 
and operations. After its arrival at a host, it is plugged into a local channel 
Ci and activated, getting the local information and updating its running total. 
After visiting all hosts, it comes back to the data centre with the final result. 
Fig. 1. Centralised Implementation



Fig. 2. Distributed Implementation 



Both implementations can complete the simple task, but the latter one has a 
richer structure that might make it more useful. For example, if the nature of the 
analysis changes, then only the mobile process needs to be changed. Consider 
an international bank: it may change both data (exchange rates) and function 
(services offered to customers) frequently. If this is programmed as a mobile 
process, then the very latest upgrade arrives automatically in local branches. It 
may also be more efficient to move the process from the centre to each host, as 
local communication would replace remote communication. 

Both systems can be specified in the π-calculus. As the π-calculus syntax does
not have assignment, we use a suitable translation to convert assignment into
π-terms as the composition of an input and an output action over a restricted
(or bound) name. The π-calculus syntax that we adopt is from [8].

Definition 1 (π-calculus assignment).

$[\![(t := e).P]\!] = \mathbf{new}\ h\ (\overline{h}\langle e\rangle.0 \mid h(t).[\![P]\!])$   $[t \in fn(P),\ h \notin fn(P)]$

where the part enclosed in [ ] is the side condition that the definition should
satisfy, and $fn(P)$ denotes the free names of $P$. □

Observationally, the effect of the assignment of a value to variable $t$ followed by
executing $P$ immediately is the same as substituting this value for all the free
occurrences of the variable $t$ in $P$. This is shown by the following simple theorem
that states the bisimilarity between the two.

Theorem 1 (Assignment bisimulation).

$(t := e).P \approx \{e/t\}P$   $[t \in fn(P)]$

where $\{e/t\}P$ denotes the systematic substitution of $e$ for $t$ in $P$.

Proof.

LHS

$= \mathbf{new}\ h\ (\overline{h}\langle e\rangle.0 \mid h(t).P)$   [definition of assignment]

$\sim \tau.(0 \mid \{e/t\}P)$   [strong bisimilarity]

$\approx 0 \mid \{e/t\}P$   [weak bisimilarity]

$=$ RHS   [structural congruence]

□

The centralised system is the composition of a Centre process and n Hosts 
over restricted channel names. 

$System = (\mathbf{new}\ c_1, c_2, \ldots, c_n)(Centre \mid Host_1 \mid Host_2 \mid \cdots \mid Host_n)$

where the Centre is defined as:

$Centre = (t := 0).c_1(v).(t := t + v).\ \cdots\ .c_n(v).(t := t + v).\overline{result}\langle t\rangle.0$

where $c_i(v)$ is the input of the data that needs to be retrieved from the $i$th host.

In the assignment t := t+v of Centre , the t in the left hand side is a restricted 
name while the t in the right is a free name. We can better comprehend the scope 
of each t and v using renaming. The effect of Centre is the same as 

$Centre =$

$(t_0 := 0).c_1(v_1).(t_1 := t_0 + v_1).\ \cdots\ .c_n(v_n).(t_n := t_{n-1} + v_n).\overline{result}\langle t_n\rangle.0$

In the HOπ [12] and polyadic π-calculus [8], abstractions (parameterised
processes) or multiple names can be transmitted directly along channel names.
In the distributed implementation, an abstraction $(z, in, out, done).P$ and the
value of $t$ are transmitted at the same time, where $(z, in, out, done).P$ and its
execution $((z, in, out, done).P)(c_i, t_{i-1}, tmp, tg)$ are defined as follows:

$(z, in, out, done).P = z(v).(out := in + v).\overline{done}\langle out\rangle.0$

$((z, in, out, done).P)(c_i, t_{i-1}, tmp, tg) = c_i(v).(tmp := t_{i-1} + v).\overline{tg}\langle tmp\rangle.0$

Let $d_{0,1}$ be the name of the channel connecting $Centre$ and $Host_1$, $d_{n,0}$ be
the name of the channel connecting $Host_n$ and $Centre$, and $d_{i,i+1}$ ($1 \le i \le n-1$)
be the name of the channel connecting $Host_i$ and $Host_{i+1}$. The specification for
the distributed system is

$MSystem =$

$(\mathbf{new}\ d_{0,1}, d_{1,2}, \ldots, d_{n,0})(MCentre \mid MHost_1 \mid MHost_2 \mid \cdots \mid MHost_n)$

The Centre now has the task of initialising its total, sending the mobile process
and total on its way, and then waiting for the total to return home¹, before
outputting the result.

$MCentre = (t := 0).\overline{d_{0,1}}\langle (z, in, out, done).P, t\rangle.d_{n,0}(final).\overline{result}\langle final\rangle.0$

1 The mobile process is discarded in the last-visited host after its mission in order to 
save network cost, but extra cost arises for specifying the last host separately. 
Each host now has an extra component: one that receives the mobile process,
executes it, and then passes it on to the next host. This component is merely
added to the previous behaviour of the host.

$MHost_i = (\mathbf{new}\ c_i)(HostC_i \mid Host_i)$

$HostC_i =$

$d_{i-1,i}(p, t_{i-1}).(\mathbf{new}\ tg)(p(c_i, t_{i-1}, tmp, tg) \mid tg(t_i).\overline{d_{i,i+1}}\langle p, t_i\rangle.0)$   for $i = 1\,..\,n-1$

$d_{n-1,n}(p, t_{n-1}).(\mathbf{new}\ tg)(p(c_n, t_{n-1}, tmp, tg) \mid tg(t_n).\overline{d_{n,0}}\langle t_n\rangle.0)$   for $i = n$

In $HostC_i$, we use a restricted name $tg$ to make the output via $d_{i,i+1}$ possible
after $p$'s execution. If we discard $\overline{done}\langle out\rangle$ in the transmitted abstraction,
then the value of the updated $t$ would not be passed on to the next
host, and $p(c_i, t_{i-1}, tmp).\overline{d_{i,i+1}}\langle p, t_i\rangle.0$ would be syntactically wrong, because
$p(c_i, t_{i-1}, tmp)$ is a process rather than a prefix.

The centralised and distributed systems are observationally equivalent, and
the π-calculus can be used to verify this. So, if we are convinced that the
centralised system calculates the right answer, then we have an argument that the
distributed system does so too.
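The claimed equivalence can be illustrated, though of course not proved, by running both designs on the same data. The following sketch abstracts away channels and concurrency entirely; the function names `centralised` and `distributed` and the closure `p` standing in for the transmitted abstraction are assumptions made for illustration.

```python
from typing import Callable, List

def centralised(host_data: List[int]) -> int:
    """Centre: t := 0, then read each host's value v and set t := t + v."""
    t = 0
    for v in host_data:
        t = t + v
    return t

def distributed(host_data: List[int]) -> int:
    """MCentre sends a process and the running total; each host runs it locally."""
    # p plays the role of the abstraction (z, in, out, done).P
    p: Callable[[int, int], int] = lambda total, v: total + v
    t = 0
    for v in host_data:          # the pair (p, t) travels from host to host
        t = p(t, v)              # host i activates p on its local value
    return t                     # the last host returns the total to the centre
```

Both functions compute the same sum for any list of host values, which is the observable result that the two π-calculus systems output on `result`.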

We are interested in a step-wise development process in the spirit of Mor- 
gan’s refinement calculus for sequential programs [9]. We would like to start from 
an abstract, centralised specification, and develop a concrete, distributed imple- 
mentation, proceeding in small steps that are easy to justify and that explain 
design decisions. It is not easy to follow this discipline in the π-calculus. Instead,
we would like to base our language of mobile processes firmly on the notion of
refinement, and develop sets of laws that encourage piece-wise development. In
later sections, after we present the denotational semantics and laws for mobile
processes, we show this step-wise development in our proposed approach.

3 Syntax and Semantics 

The syntax of our language is a subset of occam [6] and CSP [4,10], but en- 
riched with primitives for process variables, mobile and clone assignment and 
communication, and (parameterised) process variable activation. These mobility 
constructs are inspired by occam m [1]. 

In discussing the semantics, we make use of the following conventions for
meta-variables: p and q range over all program variables; t ranges over data
variables; e ranges over data; h ranges over process variables; E ranges over 
data or process values; x, y and z range over formal name, value, and result 
parameters; ne, ve, and re range over actual name, value, and result parameters; 
b ranges over boolean values; X ranges over sets of events. 

The basic elements in our model are processes, which are constructed from 
the following rules: 
P ::= SKIP | STOP | CHAOS | vid p : T | end p
    | h := {[P]} | h := {[λ x : var(T1), y : val(T2), z : res(T3) • P]}
    | t := e | p := m q | p := q | <h> | h(ne, ve, re)
    | Comm → P | P ◁ b ▷ Q | P ; Q | b * P
    | P ∥ Q | P □ Q | P ⊓ Q | P \ X
vid ::= var | proc
Comm ::= ch?p | ch!q | ch!!q | ch.E

SKIP terminates immediately; STOP represents deadlock; CHAOS is the worst 
process and the bottom element of the complete lattice of mobile processes: its 
behaviour is arbitrary. 

The variable declaration vid p : T introduces a new variable p of type T;
correspondingly end p terminates the scope of p. When it is clear from the 
context that p is a data variable or a process variable, we use var or proc for its 
declaration. The type T determines the range of values that p can have. When 
p is a process variable, its type determines the alphabet and interface 2 of the 
process that may be assigned to p. For convenience, we often omit types when 
they are irrelevant. 

Higher-order programming treats programs as data, and higher-order assignment
or communication assigns process constants to higher-order variables. Process
constants are enclosed in brackets, which have no semantic importance and
are ignored when the process is activated. We distinguish simple process
constants from parameterised ones. The higher-order constant assignment h := {[P]}
assigns a simple process constant P to h, and h := {[λ x : var(T1), y : val(T2), z :
res(T3) • P]} assigns h a parameterised process constant, which has a body P
and a formal name parameter x, value parameter y, and result parameter z.

The first-order constant assignment t := e assigns a value e to the data vari- 
able t. The clone variable assignment p := q is similar to constant assignment, 
except that the term in its right hand side is a variable rather than a value. After 
this clone assignment, the value of p is updated according to q's value, and q
gets a value that is better than its original one (in section 3.1, we explain what 
a better value is and why the value should be better). The notation p := m q 
denotes mobile variable assignment. On its termination, the value of the target 
variable p is updated and the source variable q is undefined, thus the result of 
any attempt of using q is unpredictable. 
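The difference between clone and mobile assignment can be pictured with a small store of named variables. This is a sketch under simplifying assumptions: the sentinel `UNDEFINED` stands in for the arbitrary value that the semantics leaves in the source variable, and the clone keeps the source unchanged, which is only the simplest case of "a value that is better than its original one".

```python
from typing import Any, Dict

UNDEFINED = object()    # stand-in for "no usable value"; the semantics leaves q arbitrary

def clone_assign(store: Dict[str, Any], p: str, q: str) -> None:
    """p := q : p receives q's value and q keeps a usable value."""
    store[p] = store[q]

def mobile_assign(store: Dict[str, Any], p: str, q: str) -> None:
    """Mobile assignment: p receives q's value and q loses it."""
    store[p] = store[q]
    store[q] = UNDEFINED
```

After `mobile_assign(store, "p", "q")`, any later use of `store["q"]` sees the sentinel, mirroring the statement that using q after a mobile assignment is unpredictable.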

The notation ch.E stands for a communication that takes place as soon as 
both participants are ready. The input prefix ch?p → P accepts a message from
the channel, assigns it to p, and then behaves like P. The mobile output prefix
ch!!q → P transfers the value of variable q through channel ch and then executes
P. As in mobile assignment, any attempt to use q after output is unpredictable.
The clone output prefix ch!q → P outputs the value of q, but retains q's value.

Once initialised, a process variable h may be activated by executing it, de- 
noted by <h>. A parameterised process variable can be activated by providing 

2 The interface is defined as parameters and input /output channels through which the 
process can interact with its environment. 
parameters. If a process variable h has the value of (λ x : var(T1), y : val(T2), z :
res(T3) • P), then the effect of the activation h(ne, ve, re) is calculated by

h(ne, ve, re) = var x := ne, y := ve, z; P; ne := x, re := z; end x, y, z

It initially assigns the values of actual parameters ne and ve to the formal name 
and value parameters of P , and executes P . On the termination of P , the values 
of the name and result parameters are passed back to the actual parameters ne 
and re. 
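The copy-in, copy-out reading of this activation rule can be sketched as follows. The store of named variables, the helper `activate`, and the body `add_and_count` are hypothetical; the value parameter `ve` is passed as a plain value rather than an expression.

```python
from typing import Any, Callable, Dict

def activate(body: Callable[[Dict[str, Any]], None],
             store: Dict[str, Any], ne: str, ve: Any, re: str) -> None:
    """h(ne, ve, re) = var x := ne, y := ve, z; P; ne := x, re := z; end x, y, z"""
    local = {"x": store[ne], "y": ve, "z": None}   # var x := ne, y := ve, z
    body(local)                                     # P
    store[ne] = local["x"]                          # ne := x
    store[re] = local["z"]                          # re := z
    # returning from the function ends the scope of x, y and z

def add_and_count(local: Dict[str, Any]) -> None:
    """Hypothetical body P: z := x + y; x := x + 1."""
    local["z"] = local["x"] + local["y"]
    local["x"] = local["x"] + 1
```

For example, with `store = {"n": 1, "out": 0}`, the call `activate(add_and_count, store, "n", 10, "out")` updates both the name parameter `n` and the result parameter `out`.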

A conditional (P ◁ b ▷ Q) behaves as P if b is true, otherwise as Q. The
sequential composition of two processes (P ; Q) results in a process that behaves
as P, and, on termination of P, behaves as Q. An iteration (b * P) repeatedly
executes P until b is false. A parallel composition (P ∥ Q) executes P and
Q concurrently, such that events in the alphabet of both parties require their
simultaneous participation, whereas the remaining events occur independently.
An external choice (P □ Q) allows the environment to choose between P and Q,
whereas the internal choice (P ⊓ Q) selects one of the two processes
nondeterministically. P \ X hides the events in the set X, so that they happen invisibly,
without the participation of the environment. 

3.1 Semantics 

The denotational semantics of the model is given in the style of Hoare & He's
unifying theories [5]. In the UTP, the semantics of a process $P$ is given in terms of
an implicit description of its entire behaviour using the following observational
and program variables (denoted by $\alpha P$), in which the undecorated variables
record the initial state before $P$ is started, and the dashed ones record the
intermediate or final state at the time of observation.

– $A$, the set of events in which $P$ can engage

– $ok, ok' : \mathbb{B}$, indicating freedom from divergence

– $wait, wait' : \mathbb{B}$, indicating waiting for interaction with the environment

– $tr, tr' : \mathrm{seq}\,A$, recording traces of events

– $ref, ref' : \mathbb{P}A$, indicating events currently refused

– $v, v'$, program variables (including higher-order variables)

We use $obs$ as shorthand for the observational variables $ok$, $wait$, $tr$, $ref$. $P$'s
behaviour is described by a relation between undashed and dashed variables.
In this model, a failure is indicated by a pair $(tr, ref)$, and
divergence is captured by the $ok$ variable. Therefore, we are able to reason about
the failures and divergences of processes.

Healthiness conditions distinguish feasible specifications or designs from in- 
feasible ones. There are five healthiness conditions for mobile processes as follows. 
M1 $P = P \land (tr \le tr')$

M2 $P = \sqcap\{P[s, s \frown (tr' - tr)/tr, tr'] \mid s \in \mathrm{seq}\,A\}$

M3 $P = I\!I \lhd wait \rhd P$

where

$I\!I = I \lhd ok \rhd (tr \le tr')$

$I = (ok' = ok) \land (tr' = tr) \land (wait' = wait) \land (ref' = ref) \land (v' \sqsupseteq v)$

M4 $P = P \lhd ok \rhd (tr \le tr')$

M5 $P = P ; J$

where

$J = (ok \Rightarrow ok') \land (tr' = tr) \land (wait' = wait) \land (ref' = ref) \land (v' \sqsupseteq v)$

M1 says that a process can never undo its execution. M2 indicates that the
initial value of trace $tr$ has no influence on a process. M3 describes the fact
that if a process starts in the waiting state of its predecessor, then it will leave
the state unchanged. M4 states that if a process starts in a non-stable state,
then we cannot predict its behaviour except trace expansion. M5 shows the fact
that divergence is something that is never wanted, which is characterised by the
monotonicity of $ok'$. All processes in our language are healthy processes. More
details can be found in [5] and [14].
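A healthiness condition is a function on predicates whose fixed points are the healthy processes. As a minimal sketch, M1 alone can be modelled over observations that record only `tr` and `tr'`; the dictionary encoding and the helper `is_prefix` are assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

Obs = Dict[str, Sequence[str]]          # only tr and tr' are modelled here
Pred = Callable[[Obs], bool]

def is_prefix(s: Sequence[str], t: Sequence[str]) -> bool:
    return len(s) <= len(t) and list(t[:len(s)]) == list(s)

def M1(P: Pred) -> Pred:
    """M1(P) = P and (tr <= tr'): the trace may only be extended."""
    return lambda o: P(o) and is_prefix(o["tr"], o["tr'"])

anything: Pred = lambda o: True          # a predicate that is not M1-healthy
healthy = M1(anything)
```

Applying `M1` a second time changes nothing, which is the idempotence one expects of a healthiness condition.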

There are two differences between our approach and first-order program- 
ming: the refinement ordering between variables; and the semantics of higher- 
order assignment and communication. In first-order programming, we say a process
$P$ is refined by $Q$, denoted by $P \sqsubseteq Q$, if for all variables in the alphabet,
the behaviour of $Q$ is more predictable and controllable than that of $P$.

Definition 2 (Process refinement).

$P \sqsubseteq Q = \forall\, obs, obs', v, v' \bullet Q \Rightarrow P$

The universal closure over the alphabet is written more concisely: $[Q \Rightarrow P]$. □
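Over a finite alphabet, Definition 2 can be checked by enumeration. The sketch below assumes a toy alphabet with just the two boolean observations `ok` and `ok'`; the predicates `spec` and `impl` are hypothetical examples.

```python
from itertools import product
from typing import Callable, Tuple

Obs = Tuple[bool, bool]                 # toy alphabet: (ok, ok')
Pred = Callable[[Obs], bool]

def refines(P: Pred, Q: Pred) -> bool:
    """P is refined by Q iff [Q => P]: every observation allowed by Q is allowed by P."""
    return all((not Q(o)) or P(o) for o in product([False, True], repeat=2))

spec: Pred = lambda o: (not o[0]) or o[1]        # ok => ok'
impl: Pred = lambda o: o[1]                      # ok'
```

Here `impl` refines `spec` but not the other way round, because `spec` also allows the observation in which both `ok` and `ok'` are false.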

As higher-order variables hold processes as their values, the ordering between
program variables can be defined in the light of the refinement ordering between
their values. Two process variables can be compared only when they have the
same types. We say a process variable $h$ is a refinement of $g$, if the activation
behaviour of $h$ is more controllable and predictable than that of $g$. For
first-order data variables, two variables are comparable only when they are equal.
More specifically, we define the refinement ordering between variables as follows.

Definition 3 (Variable refinement). Let $p$ and $q$ be two program variables
of the same type

$p \sqsubseteq q =$

$p = q$   if $p, q$ are data variables

$[\,q(ne, ve, re) \Rightarrow p(ne, ve, re)\,]$   if $p, q$ are parameterised process variables

$[\,\langle q\rangle \Rightarrow \langle p\rangle\,]$   otherwise

where $ne$, $ve$, and $re$ are the actual name parameter, value parameter and result
parameter for the activations of $p$ and $q$. □
First-order assignment $t := e$ has no interaction with the environment. It
always terminates and never diverges. On termination, it equates the final value
of $t$ with value $e$, but does not change the other variables.

Definition 4 (First-order assignment).

$[\![t := e]\!] = M34(ok' \land \neg wait' \land tr' = tr \land t' = e \land v' = v)$

where $v$ contains all program variables except $t$, $M3(P) = I\!I \lhd wait \rhd P$,
$M4(P) = P \lhd ok \rhd (tr \le tr')$, and $M34$ is $M3 \circ M4$. The healthiness conditions
are required to make sure that the assignment does not take place unless the
preceding process terminated properly. The satisfaction of the other healthiness
conditions can be derived from the definition [14]. □

As discussed above, we must give a semantics to assignment that makes it
monotonic with respect to refinement.

Definition 5 (Higher-order constant assignment).

$[\![h := \{[P]\}]\!] = M34(ok' \land \neg wait' \land tr' = tr \land h' \sqsupseteq P \land v' \sqsupseteq v)$   $[h, h' \notin \alpha P]$

where $v$ contains all program variables except $h$. □

Theorem 2 (Monotonicity). Suppose that h is a higher-order variable. 

PQQ =► h := {[P]} C h := D 

It may be instructive to see what would happen if we had used equalities in 
Definition 5. 

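\[
\begin{array}{cll}
  & h := \{[P]\} \;\sqsubseteq\; h := \{[Q]\} & \text{[refinement (definition 2)]} \\
= & [\, h := \{[Q]\} \Rightarrow h := \{[P]\} \,] & \text{[assumption (definition of assignment)]} \\
= & [\, M34(ok' \land \lnot wait' \land tr' = tr \land h' = \{[Q]\} \land v' = v) \Rightarrow & \\
  & \phantom{[\,} M34(ok' \land \lnot wait' \land tr' = tr \land h' = \{[P]\} \land v' = v) \,] & \text{[definition of M3 and M4]} \\
= & [\, ok' \land \lnot wait' \land tr' = tr \land h' = \{[Q]\} \land v' = v \Rightarrow & \\
  & \phantom{[\,} ok' \land \lnot wait' \land tr' = tr \land h' = \{[P]\} \land v' = v \,] & \text{[one-point rule, three times]} \\
= & [\, ok' \land \lnot wait' \Rightarrow ok' \land \lnot wait' \land \{[Q]\} = \{[P]\} \,] & \text{[propositional calculus]} \\
= & [\, ok' \land \lnot wait' \Rightarrow \{[Q]\} = \{[P]\} \,] & \text{[universal closure, case analysis]} \\
= & (P = Q) &
\end{array}
\]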

So, monotonicity would hold only in a trivial sense. 
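
The contrast between the inequation and the equality can be checked on a small finite model. The sketch below is only an illustration under assumptions that are not part of the theory above: a process is modelled as a finite set of behaviours, refinement is reverse inclusion, and an assignment to h is modelled solely by the set of final values it allows for h. The names refines, refinements_of, assign_inequation and assign_equality are hypothetical.

```python
from itertools import combinations

def refines(p, q):
    """p ⊑ q in the toy model: every behaviour of q is a behaviour of p."""
    return q <= p

def refinements_of(p):
    """All processes r with p ⊑ r, i.e. every subset of p's behaviours."""
    items = sorted(p)
    return {frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)}

def assign_inequation(p):
    """Final values allowed for h when the assignment requires h' ⊒ p."""
    return refinements_of(p)

def assign_equality(p):
    """Final values allowed for h if the assignment required h' = p."""
    return {p}

P = frozenset({"a", "b", "c"})
Q = frozenset({"a", "b"})          # P ⊑ Q, and P differs from Q

# With the inequation, P ⊑ Q lifts to the two assignments.
mono_ineq = refines(assign_inequation(P), assign_inequation(Q))
# With equality, the assignments are comparable only when P = Q.
mono_eq = refines(assign_equality(P), assign_equality(Q))
```

In this model mono_ineq holds while mono_eq does not, which mirrors the calculation above: with equality the two assignments are related only in the trivial case P = Q.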

For the same reason, we also adopt the inequation in the definitions of higher- 
order communication and /, U and SKIP. For first-order data, two values are 
comparable if and only if they are equal. Therefore higher-order assignment or 
communication and SKIP are consistent with their counterpart in first-order 
programming. We extend the language, but we are interested in a conservative 
extension in which the new semantics is not only suitable for the extension part, 
but also for the original part. Interested readers may refer to [14] for more details 
about the semantics of mobile processes. 

4 Development Method for Distributed Systems 

Our development procedure involves two main steps. In the first step we abstract 
predicates and assign them to parameterised process variables (see Section 4.1). 
In the second step, by converting assignment into communication, we make the 
variable mobile: consequently, the activations of the variable can be completed 
in different hosts over a network (see Section 4.2). 

We present the development procedure through a set of algebraic properties 
and laws for mobile processes. The proofs of these laws are largely straightfor- 
ward, therefore we omit them in this paper. Some of the proofs have been given 
in [14]. 

We use two syntactic abbreviations: we write (proc h := {[Q]}) instead of 
(proc h ; h := {[Q]}); and we abbreviate (proc h ; Q ; end h) by (proc h • Q) 
to indicate that the scope of h is Q. 

The mobility of processes is expressed in the following law. 

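Law 1 (Undefined activation) For distinct higher-order variables g and h, 

\[
\begin{array}{ll}
(g :=_m h \;;\; \langle h \rangle) = \mathit{CHAOS} & \text{[Law 1.A]} \\
(ch!!h \to \langle h \rangle) = ch.h \;;\; \mathit{CHAOS} & \text{[Law 1.B]}
\end{array}
\]

where ch is a channel name. □ 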

Law 1 captures the fact that a mobile process has moved after assignment or 
communication, since its value has been passed to a new location (g, or the other 
end of the channel ch), and none of its behaviours is available at its old location 
(the higher-order variable h). In Law 1.A, as there is no communication with 
the environment, the update of g cannot be observed, therefore the whole effect 
is the same as CHAOS; however, in Law 1.B, the execution of h still leads to 
CHAOS, but this does not undo the communication that has already happened. 

4.1 Abstracting Process Variables 

Law 2 (Process constant) A process constant is not subject to substitutions 
for the variables it contains. 

\[
(P(x)_{+h} \;;\; h := \{[Q(x)]\}) = (h := \{[Q(x)]\} \;;\; P(x)_{+h})
\qquad [h, h' \notin \alpha P,\; h, h' \notin \alpha Q]
\]

where P(x) and Q(x) are processes in which x occurs as a variable. Note the use 
of alphabet extension: $P_{+h}$ is the relation $P \land h' \sqsupseteq h$, with alphabet $\alpha P \cup \{h, h'\}$. 
This is needed in order to balance alphabets. □ 
Law 3 (Variable introduction) We can always include a process in the scope 
of a new variable, assuming this variable is not in the alphabet of the process. 

\[
Q = (\mathbf{var}\ p \;;\; Q_{+p} \;;\; \mathbf{end}\ p) \qquad [p, p' \notin \alpha Q]
\]

□ 

Law 4 (Vacuous constant assignment) The assignment of a constant to a 
process variable at the end of its scope is vacuous. 

\[
(h := \{[Q]\} \;;\; \mathbf{end}\ h) = (\mathbf{end}\ h) \qquad [h, h' \notin \alpha Q]
\]

The assignment $h :=_m g \;;\; \mathbf{end}\ h$ would not have been vacuous, since the effect on 
g (making its value undefined) would persist beyond the end of h's scope. □ 

Law 5 (Vacuous mobile variable assignment) The corresponding law for 
mobile variable assignment is stronger in that it requires both variable scopes to 
end. 

\[
(p :=_m q \;;\; \mathbf{end}\ p \;;\; \mathbf{end}\ q) = (\mathbf{end}\ p \;;\; \mathbf{end}\ q)
\]

Of course, it does not matter in which order the scopes are ended. □ 

Law 6 (Vacuous identity assignment) The effect of the assignment of a 
variable to itself is the same as SKIP. 

\[
(p := p) = (p :=_m p) = \mathit{SKIP}
\]

When the mobile assignment terminates, on the right hand side, p's value 
becomes arbitrary; however, on the left hand side p gets a value that refines the 
original value of right-hand-side p. Therefore, the whole conjunctive effect is that 
p refines its original value, which is the same as SKIP. □ 

Law 7 (Scope-extension/shrinkage) The scope of a variable may be 
extended by moving its declaration in front of a process that contains no free 
occurrences of it, or moving its variable undeclaration after this process. 

\[
\begin{array}{l}
(P \;;\; \mathbf{var}\ p) = (\mathbf{var}\ p \;;\; P_{+p}) \\
(\mathbf{end}\ p \;;\; P) = (P_{+p} \;;\; \mathbf{end}\ p)
\end{array}
\qquad [p, p' \notin \alpha P]
\]

□ 

Law 8 (Copy-rule-1) Following the meaning of process variable activation, a 
process can be replaced by the activation of a variable to which this process 
constant has previously been assigned. 

\[
(h := \{[Q]\} \;;\; Q) = (h := \{[Q]\} \;;\; \langle h \rangle) \qquad [h, h' \notin \alpha Q]
\]

Even though we have adopted inequality in higher-order assignment, this rule is 
an equality rather than a refinement. This is because h is assigned, 
nondeterministically, a process that refines Q; consequently, activating h is the same as 
executing $\bigsqcap \{ R \mid R \sqsupseteq Q \}$, and the result of this nondeterministic choice equals 
the weakest process Q. □ 
Law 9 (Copy-rule-2) Any process can be replaced by assigning this process 
constant to a newly declared process variable, followed by its activation. 

\[
Q = (\mathbf{proc}\ h := \{[Q]\} \bullet \langle h \rangle) \qquad [h, h' \notin \alpha Q]
\]

□ 



Law 10 (Parameterised copy-rule) The law for parameterised processes 
differs in that we activate them by providing actual parameters. 

\[
Q(i, j, k) =
\left(\begin{array}{l}
\mathbf{proc}\ h := \{[\lambda x : \mathit{var}(T_1),\, y : \mathit{val}(T_2),\, z : \mathit{res}(T_3) \bullet Q(x, y, z)]\} \bullet \\
\quad h(i, j, k)
\end{array}\right)
\qquad [h, h' \notin \alpha Q]
\]

where Q(i, j, k) is a parameterised process with actual name, value and result 
parameters i, j and k of type $T_1$, $T_2$ and $T_3$ respectively. □ 



Law 9 and Law 10 are the key rules to abstract the same or similar segments by 
a higher-order process variable. 

We introduce two notations to represent a series of sequential compositions. 

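Definition 6 (Indexed sequential composition). An indexed sequential 
composition is the sequential composition of a series of processes in order. 

\[
(\mathop{;} i : 1 \,..\, n \bullet P_i) \;\mathrel{\widehat{=}}\;
\begin{cases}
\mathit{SKIP} & n = 0 \\
(\mathop{;} i : 1 \,..\, n-1 \bullet P_i) \;;\; P_n & n \ge 1
\end{cases}
\]

□ 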



Definition 7 (Iterated sequential composition). Sequential composition 
may be iterated over any sequence s. 

\[
(\mathop{;} i : s \bullet P(i)) \;\mathrel{\widehat{=}}\;
\begin{cases}
\mathit{SKIP} & s = \langle\rangle \\
P(head(s)) \;;\; (\mathop{;} i : tail(s) \bullet P(i)) & s \neq \langle\rangle
\end{cases}
\]

where i ranges over the elements of the sequence s, P is a parameterised 
process, head(s) is the first element of s, and tail(s) is the subsequence of s 
that remains after removing its first element. □ 



For example, we denote the program (t := t + 2; t := t + 7; t := t + 5) as 

\[
\mathop{;} i : \langle 2, 7, 5 \rangle \bullet \{[\lambda j : \mathit{val}(\mathbb{N}) \bullet t := t + j]\}(i)
\]
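
Definition 7 can be read operationally as a fold over the sequence. The following is a minimal sketch under simplifying assumptions of ours: a process is modelled as a function from states to states (ignoring traces, waiting and divergence), and the helper names skip, seq, iterated_seq and add_to_t are hypothetical.

```python
def skip(state):
    """SKIP: leave the state unchanged."""
    return state

def seq(p, q):
    """Sequential composition p ; q of two state transformers."""
    return lambda state: q(p(state))

def iterated_seq(s, P):
    """(; i : s . P(i)): SKIP for the empty sequence,
    otherwise P(head(s)) ; (; i : tail(s) . P(i))."""
    if not s:
        return skip
    return seq(P(s[0]), iterated_seq(s[1:], P))

def add_to_t(j):
    """The parameterised process  lambda j : val(N) . t := t + j."""
    return lambda state: {**state, "t": state["t"] + j}

program = iterated_seq([2, 7, 5], add_to_t)
final = program({"t": 0})
```

Starting from t = 0, the composed program performs the three assignments in order and leaves t = 14.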



For a series of similar pieces of program, we may be able to assign the pa- 
rameterised process to a newly-introduced process variable, and activate it in 
series with proper arguments. 
Law 11 (Iterated parameterised copy-rule) Suppose Q is a parameterised 
process with a value parameter of type I; then, for any sequence s, 

\[
(\mathop{;} i : s \bullet \{[\lambda j : \mathit{val}(I) \bullet Q(j)]\}(i)) =
\left(\begin{array}{l}
\mathbf{proc}\ h := \{[\lambda j : \mathit{val}(I) \bullet Q(j)]\} \bullet \\
\quad (\mathop{;} i : s \bullet h(i))
\end{array}\right)
\qquad [h, h' \notin \alpha Q]
\]

□ 



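For instance, by using this law, we have the following derivation 

\[
(t := t + 2 \;;\; t := t + 7 \;;\; t := t + 5) =
\left(\begin{array}{l}
\mathbf{proc}\ h := \{[\lambda j : \mathit{val}(\mathbb{N}) \bullet t := t + j]\} \bullet \\
\quad (h(2) \;;\; h(7) \;;\; h(5))
\end{array}\right)
\]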
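
The effect of this derivation can be mimicked by binding a function to a name and calling it repeatedly. The sketch below is an informal analogy rather than the semantics given above: states are dictionaries, the process variable h is an ordinary Python name, and inline_program and abstracted_program are hypothetical names for the two sides of the equation.

```python
def inline_program(state):
    """Left-hand side: t := t + 2 ; t := t + 7 ; t := t + 5."""
    state = {**state, "t": state["t"] + 2}
    state = {**state, "t": state["t"] + 7}
    state = {**state, "t": state["t"] + 5}
    return state

def abstracted_program(state):
    """Right-hand side: proc h := {[lambda j . t := t + j]} . h(2) ; h(7) ; h(5)."""
    def h(j):
        # the process constant assigned to h, parameterised by the value j
        return lambda st: {**st, "t": st["t"] + j}
    for i in (2, 7, 5):
        state = h(i)(state)   # activation h(i)
    return state

# Compare both sides on a few initial states.
agree = all(inline_program({"t": t0}) == abstracted_program({"t": t0})
            for t0 in range(-3, 4))
```

Both sides agree on every initial state tried here, as Law 11 predicts for this instance.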



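As a special case of the above law, when s = ⟨1, 2, ..., n⟩, 

\[
(\mathop{;} i : 1 \,..\, n \bullet P(i)) =
\left(\begin{array}{l}
\mathbf{proc}\ h := \{[\lambda j : \mathit{val}(I) \bullet P(j)]\} \bullet \\
\quad (\mathop{;} i : 1 \,..\, n \bullet h(i))
\end{array}\right)
\]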

In the same way, we have similar laws for an iterated parameterised process 
that has name or result parameters. 



4.2 Moving Process Variables 

Even though we group similar pieces of specification as the value of a newly 
introduced parameterised process variable and activate it at necessary occur- 
rences (Law 11), the whole specification is still centralised. In order to achieve 
a distributed system, we may consider putting many activations of this variable 
in different distributed components. To make sure that the variables activated 
in different components have the same process values or similar structures, we 
initialise the variable at one component but make it mobile, transmitted from 
one distributed component to another one. It is necessary to introduce com- 
munication in this step. Actually, the assignment and the communication are 
semantically equivalent. 

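Law 12 (Assignment-communication equivalence) 

\[
\begin{array}{l}
(p := q) = ((ch?p \to \mathit{SKIP}) \parallel (ch!q \to \mathit{SKIP})) \setminus \{ch\} \\
(p :=_m q) = ((ch?p \to \mathit{SKIP}) \parallel (ch!!q \to \mathit{SKIP})) \setminus \{ch\}
\end{array}
\]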

□ 
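
Law 12 can be illustrated with two threads that share a private channel. This is a rough sketch only, under assumptions of ours: Python's queue.Queue is a buffered channel rather than a synchronous CSP channel, hiding is approximated by keeping the queue local to the function, and the sentinel UNDEFINED and the function assign_via_channel are hypothetical names.

```python
import queue
import threading

UNDEFINED = object()  # stands for the arbitrary value left behind by a move

def assign_via_channel(store, mobile=False):
    """Model (ch?p -> SKIP) || (ch!q -> SKIP), or ch!!q when mobile is True."""
    ch = queue.Queue(maxsize=1)      # local to this call, so hidden from the caller

    def receiver():
        store["p"] = ch.get()        # ch?p -> SKIP

    def sender():
        ch.put(store["q"])           # ch!q -> SKIP
        if mobile:
            store["q"] = UNDEFINED   # ch!!q: the source loses its value

    threads = [threading.Thread(target=receiver),
               threading.Thread(target=sender)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return store

copied = assign_via_channel({"p": 0, "q": 42})
moved = assign_via_channel({"p": 0, "q": 42}, mobile=True)
```

After the ordinary version both p and q hold 42, as after p := q; after the mobile version p holds 42 and q holds the sentinel, as after the mobile assignment.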

We borrow the concepts of pipes and chaining in CSP [4, 10]. Pipes are special 
processes which have only two channels, namely an input channel left and an 
output channel right. For example, a pipe that recursively accepts a number 
from left, and outputs its double to right can be represented by: 

\[
\mu X \bullet left?p \to right!(p + p) \to X
\]
Definition 8 (Chaining). Chaining links two pipes together as a new pipe. 

\[
P \gg Q \;\mathrel{\widehat{=}}\; (P[mid/right] \parallel Q[mid/left]) \setminus \{mid\}
\]

□ 
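
Pipes and chaining have a familiar counterpart in stream-processing code. The sketch below is an informal analogy using Python generators, in which the left channel is the input iterable and the right channel is the yielded stream; the names doubler, incrementer and chain are hypothetical, and incrementer is an extra pipe introduced only to have something to chain with.

```python
def doubler(left):
    """mu X . left?p -> right!(p + p) -> X"""
    for p in left:
        yield p + p

def incrementer(left):
    """A second, made-up pipe: mu X . left?p -> right!(p + 1) -> X"""
    for p in left:
        yield p + 1

def chain(P, Q):
    """P >> Q: the right channel of P feeds the left channel of Q.
    The intermediate stream is never exposed, which plays the role of hiding mid."""
    return lambda left: Q(P(left))

pipeline = chain(doubler, incrementer)
out = list(pipeline([1, 2, 3]))
```

Here out is [3, 5, 7]: each input is doubled by the first pipe and then incremented by the second, and only the final stream is visible to the environment.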



An indexed chaining connects a series of processes in order as a long pipe. 

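Definition 9 (Indexed-chaining). 

\[
(\mathop{\gg} i : 1 \,..\, n \bullet P_i) \;\mathrel{\widehat{=}}\;
\begin{cases}
P_1 & n = 1 \\
(\mathop{\gg} i : 1 \,..\, n-1 \bullet P_i) \gg P_n & n > 1
\end{cases}
\]

□ 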

Definition 10 (Double-chaining). Double chaining links two pipes as a ring. 

\[
P \mathbin{<\!>} Q \;\mathrel{\widehat{=}}\;
(P[mid_1, mid_2 / right, left] \parallel Q[mid_1, mid_2 / left, right]) \setminus \{mid_1, mid_2\}
\]

All the communications between pipes are hidden from the environment. □ 

The double chaining operator is commutative. 

Law 13 (Double-chaining commutative) 

\[
P \mathbin{<\!>} Q = Q \mathbin{<\!>} P
\]

□ 

A ring of processes can be viewed as a long chain with the two chain ends 
connected. The order of processes in the ring is important, but the connecting 
point of the chain can be arbitrary. In other words, the chain can start at 
any process and end at the process immediately preceding it in the ring. This 
feature is captured by the following law. 

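Law 14 (Exchange) 

\[
\begin{array}{cll}
  & P_1 \mathbin{<\!>} (\mathop{\gg} i : 2 \,..\, n \bullet P_i) & \\
= & P_k \mathbin{<\!>} ((\mathop{\gg} i : k+1 \,..\, n \bullet P_i) \gg (\mathop{\gg} i : 1 \,..\, k-1 \bullet P_i)) & 1 < k < n \\
= & P_n \mathbin{<\!>} (\mathop{\gg} i : 1 \,..\, n-1 \bullet P_i) &
\end{array}
\]

□ 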



The update of a variable p by an expression of p can be implemented by 
double chaining two pipes, where the first pipe mobile outputs p, while the 
second pipe inputs the value of p to r and then outputs the value of the updated 
variable immediately to the first pipe. 

Law 15 (Delegation with double-chaining) 

\[
(p := f(p)) =
\left(\begin{array}{l}
(right!!p \to left?p \to \mathit{SKIP}) \\
\mathbin{<\!>}\; (\mathbf{var}\ r \;;\; left?r \to right!f(r) \to \mathbf{end}\ r)
\end{array}\right)
\]

where f(p) is an expression of p. □ 
As the update is executed in the second pipe, the value of p in the first pipe 
is irrelevant after the second pipe receives it. At the same time, the value of 
the intermediate variable q is of no use after p gets its value by assignment. 
Therefore, we adopt mobile output for p and mobile assignment for q in the first 
pipe. In the second pipe, f(r) is not a variable but a value based on variable r, 
so we use normal output for f(r). 

Similarly, the serial update of a variable can also be implemented by a ring 
of pipes, in which different updates are executed in different pipes. 

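Law 16 (Serial delegation with chaining) 

\[
(w := g(f(w))) =
\left(\begin{array}{l}
(right!!w \to left?w \to \mathit{SKIP}) \\
\mathbin{<\!>} \\
\left(\begin{array}{l}
(\mathbf{var}\ p \;;\; left?p \to right!f(p) \to \mathbf{end}\ p) \\
\gg (\mathbf{var}\ q \;;\; left?q \to right!g(q) \to \mathbf{end}\ q)
\end{array}\right)
\end{array}\right)
\]

Proof. Similar to the proof of Law 15. □ 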

More generally, a series of updates to p by different processes 
$F_i(p, p')$ can be replaced by loop pipelining, in which the update tasks 
are allocated to different pipes. 



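Law 17 (Loop pipelining) 

\[
(\mathop{;} i : 1 \,..\, n \bullet F_i(p, p') \;;\; w := p) =
\left(\begin{array}{l}
(right!!p \to left?p \to w := p) \\
\mathbin{<\!>} \\
(\mathop{\gg} i : 1 \,..\, n \bullet (\mathbf{var}\ r \;;\; left?r \to F_i(r, r') \;;\; right!!r \to \mathbf{end}\ r))
\end{array}\right)
\]

□ 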

In the right hand side of the above law, the value of p travels from the first pipe 
to the series of pipes. Its final value, which is stored in w, is retrieved after p’s 
travelling back from the series of pipes. 
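
The journey of p around the ring can be simulated with one thread per pipe. The following sketch is an illustration under assumptions of ours, not a model of the UTP semantics: each $F_i$ is taken to be a deterministic function on values, channels are approximated by one-slot queues, and the names loop_pipeline and sequential are hypothetical.

```python
import queue
import threading

def sequential(p, updates):
    """Left-hand side: apply F_1, ..., F_n in order, then w := p."""
    for f in updates:
        p = f(p)
    return p

def loop_pipeline(p, updates):
    """Right-hand side: a head pipe double-chained with n update pipes."""
    n = len(updates)
    # links[i] carries the value into stage i; links[n] leads back to the head.
    links = [queue.Queue(maxsize=1) for _ in range(n + 1)]

    def stage(i):
        r = links[i].get()        # left?r
        r = updates[i](r)         # F_i(r, r')
        links[i + 1].put(r)       # right!!r

    workers = [threading.Thread(target=stage, args=(i,)) for i in range(n)]
    for wk in workers:
        wk.start()
    links[0].put(p)               # head pipe: right!!p
    w = links[n].get()            # head pipe: left?p ; w := p
    for wk in workers:
        wk.join()
    return w

updates = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
w_seq = sequential(5, updates)
w_ring = loop_pipeline(5, updates)
```

Both computations yield 9 for the initial value 5, since the value visits the update stages in the same order in either arrangement.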

When updating is performed by a series of activations of the same process 
variable, we can move this process variable around the loop pipelining and dis- 
tribute the activations in different pipes. 

Lemma 1 (Loop pipelining). Suppose that h is a parameterised process 
variable with a value parameter i and a name parameter t, then 

\[
(\mathop{;} i : 1 \,..\, n \bullet h(i, t) \;;\; w := t) =
\left(\begin{array}{l}
(right!!h!!t \to left?h?t \to w := t) \\
\mathbin{<\!>} \\
(\mathop{\gg} i : 1 \,..\, n \bullet (\mathbf{var}\ g, r \;;\; left?g?r \to g(i, r) \;;\; right!!g!!r \to \mathbf{end}\ g, r))
\end{array}\right)
\]

□ 
In practice, t is the local state of mobile process variable h. When the mobile 
process variable moves, it takes not only its process value but also its local 
state; however, in our current work, we have not formalised the local state of a 
mobile process, therefore we simply send the local state together with the process 
variable in multiple data transfer [10], which also says that the two variables h 
and t are output at the same time. 

5 Decentralising the Data Centre 

Using the laws in Section 4, we are able to show the derivation of the distributed 
system from the centralised one in Section 2. As the behaviour of the hosts to 
send out the information is not our main concern, we simply ignore this part in 
our specification. 

In the centralised system, initially, the variable t is initialised to 0. The data 
centre then repeats the task of communicating with each host, obtaining the 
information and updating t until all the hosts have been processed. Finally the 
data centre outputs the data through channel result. This specification can be 
written as 

\[
\mathbf{var}\ t := 0 \bullet (\mathop{;} i : 1 \,..\, n \bullet (c.i?v \to t := t + v)) \;;\; result!t \to \mathit{SKIP}
\]

where c.i is the channel connecting the data centre and the i-th host. We notice 
that similar pieces of specification involving input and update occur iteratively, 
therefore we can assign a parameterised process to a process variable, and then 
activate it repeatedly with proper arguments. In the parameterised process, the 
initial value of t needs to be known at the beginning of every communication with 
the host, and certainly we need to store the result of t after its update, therefore 
we select t as a name parameter. By applying the iterated parameterised copy 
rule (Law 11), we reach the step: 

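= {iterated-parameterised-copy-rule} 

\[
\begin{array}{l}
\mathbf{var}\ t := 0 \bullet \\
\quad \left(\begin{array}{l}
\mathbf{proc}\ p := \{[\lambda j : \mathit{val}(1 \,..\, n);\ u : \mathit{var}(\mathbb{N}) \bullet c.j?v \to u := u + v]\} \bullet \\
\quad (\mathop{;} i : 1 \,..\, n \bullet p(i, t))
\end{array}\right) ; \\
\quad result!t \to \mathit{SKIP}
\end{array}
\]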

The channel index j is only of relevance at the beginning of the variable activa- 
tion, therefore it is a value parameter. 

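As the output result!t does not contain variable p, it can be moved inside 
the scope of p. By an application of Law 7, we calculate the following: 

= {variable-p-scope-extension} 

\[
\begin{array}{l}
\mathbf{var}\ t := 0 \bullet \\
\quad \left(\begin{array}{l}
\mathbf{proc}\ p := \{[\lambda j : \mathit{val}(1 \,..\, n);\ u : \mathit{var}(\mathbb{N}) \bullet c.j?v \to u := u + v]\} \bullet \\
\quad (\mathop{;} i : 1 \,..\, n \bullet p(i, t)) \;;\; result!t \to \mathit{SKIP}
\end{array}\right)
\end{array}
\]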







The task of updating t by activating p can be completed using a series of 
pipes. The whole specification can be replaced by double chaining this series of 
pipes with another pipe, in which the value of process variable p and the initial 
value of t are sent out, and the final value of t is retrieved. As the intermediate 
values of t and process variable p are of no concern, we use mobile output for 
t and p. When the activations have been completed in every host, the process 
variable is of no use to us, therefore we discard it in the last-visited host n and 
only the variable w that stores the data is retrieved by the first pipe. By applying 
the loop-pipelining law (Lemma 1), we get the following specification. 



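= {loop-pipelining} 

\[
\begin{array}{l}
\mathbf{var}\ t := 0 \bullet \\
\quad \mathbf{proc}\ p := \{[\lambda j : \mathit{val}(1 \,..\, n);\ u : \mathit{var}(\mathbb{N}) \bullet c.j?v \to u := u + v]\} \bullet \\
\quad \left(\begin{array}{l}
(right!!p!!t \to left?t \to result!t \to \mathit{SKIP}) \\
\mathbin{<\!>} \\
\left(\begin{array}{l}
(\mathop{\gg} i : 1 \,..\, (n-1) \bullet (\mathbf{proc}\ q ;\ \mathbf{var}\ w ;\ left?q?w \to q(i, w) ;\ right!!q!!w \to \mathbf{end}\ q, w)) \\
\gg (\mathbf{proc}\ q ;\ \mathbf{var}\ w ;\ left?q?w \to q(n, w) ;\ right!!w \to \mathbf{end}\ q, w)
\end{array}\right)
\end{array}\right)
\end{array}
\]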



This specification is still centralised, as the scopes of p and t are valid for 
all the pipes. We notice, however, these two variables do not occur in the pipes 
involved in updating, therefore we can take these pipes out of the scope of p and 
t. Applying the variable scope shrinkage (Law 7), we reach a distributed system. 



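= {variable-t-and-p-scope-shrinkage} 

\[
\begin{array}{l}
\left(\begin{array}{l}
\mathbf{var}\ t := 0 \bullet \\
\mathbf{proc}\ p := \{[\lambda j : \mathit{val}(1 \,..\, n);\ u : \mathit{var}(\mathbb{N}) \bullet c.j?v \to u := u + v]\} \bullet \\
right!!p!!t \to left?t \to result!t \to \mathit{SKIP}
\end{array}\right) \\
\mathbin{<\!>} \\
\left(\begin{array}{l}
(\mathop{\gg} i : 1 \,..\, (n-1) \bullet (\mathbf{proc}\ q ;\ \mathbf{var}\ w ;\ left?q?w \to q(i, w) ;\ right!!q!!w \to \mathbf{end}\ q, w)) \\
\gg (\mathbf{proc}\ q ;\ \mathbf{var}\ w ;\ left?q?w \to q(n, w) ;\ right!!w \to \mathbf{end}\ q, w)
\end{array}\right)
\end{array}
\]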

In the distributed system, the data centre and the hosts are arranged as a ring. 
The first component of the double chaining is the data centre, in which the 
variables t and p are mobile sent out, along channel right , after initialisation. 
The hosts are specified as a series of pipes, in which the process variable and 
the data are received from channel left , and then output to channel right after 
activation of the process variable. 
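
The end points of the derivation can be compared on a small executable model. The sketch below rests on several assumptions of ours: the input on channel c.j is replaced by a lookup in a list of readings, hosts are numbered from 0 rather than 1, one-slot queues approximate the channels of the ring, and the names centralised and distributed are hypothetical. The travelling process value p is modelled as a Python function that is sent along the ring together with the running total.

```python
import queue
import threading

def centralised(readings):
    """var t := 0 . (; i . c.i?v -> t := t + v) ; result!t"""
    t = 0
    for v in readings:
        t = t + v
    return t

def distributed(readings):
    """Data centre double-chained with a chain of hosts."""
    n = len(readings)
    if n == 0:
        return 0
    links = [queue.Queue(maxsize=1) for _ in range(n + 1)]

    def p(j, u):
        # lambda j, u . c.j?v -> u := u + v ; readings[j] stands in for c.j?v
        return u + readings[j]

    def host(i):
        q, w = links[i].get()            # left?q?w
        w = q(i, w)                      # activate the received process locally
        if i < n - 1:
            links[i + 1].put((q, w))     # right!!q!!w
        else:
            links[i + 1].put(w)          # last host discards q: right!!w

    hosts = [threading.Thread(target=host, args=(i,)) for i in range(n)]
    for h in hosts:
        h.start()
    links[0].put((p, 0))                 # right!!p!!t, with t initialised to 0
    t = links[n].get()                   # left?t
    for h in hosts:
        h.join()
    return t                             # result!t

total_central = centralised([3, 4, 5])
total_ring = distributed([3, 4, 5])
```

For the readings 3, 4 and 5 both versions report 12; in the ring version each addition is performed at the host that owns the reading, and only the process value and the running total travel between hosts.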




6 Conclusions and Future Work 

We have presented a set of laws for the development of mobile distributed sys- 
tems. All the laws can be proven [14] within Hoare and He’s unifying theories. 
Through a simple example that can be implemented in both a centralised and a 
distributed fashion, we have shown these laws are suitable for a step-wise 
development, starting with a centralised specification and ending up with a distributed 
implementation, while we cannot achieve this using the π-calculus. 

Our current work on the semantics of mobile processes mainly focuses on their 
mobility, and our development method applies to an initial specification without 
any mobile process; however, we simply ignore the process type and rich data 
structure within a mobile process. In occam-π [1], a process type determines an 
interface, a mobile process can implement multiple process types, and the value 
of a process variable is an instance of a mobile process that implements the type 
of this variable. Therefore, the activation of a process variable is determined by 
its type, and the process from which it is initialised. We formalise these issues, 
and then study the refinement of a mobile process itself. 

In occam-π [1], channels have mobility. Channel variables reference only one 
of the ends of a channel bundle and those ends are mobile, and can be passed 
through channels. Moving one of the channel-bundle ends around a network 
enables processes to exchange communication capabilities, making the commu- 
nication highly flexible. We intend to formalise this channel-ends mobility in the 
UTP and study its refinement calculus. 

In the example in this paper, the derivation from the centralised system to 
the distributed one centres around the introduction of a higher-order variable. 
Actually, moving processes can decrease network cost by replacing remote com- 
munication with local communication. Furthermore, after a mobile assignment 
or a mobile communication, the source variable is undefined and its allocated 
memory space is released to the environment. Clearly, lower network cost and 
reduced memory usage can improve performance; however, we have not 
demonstrated such an improvement for our mobile system. We intend to 
investigate techniques for reasoning about this. 

Our main objective is to include the semantics of mobile processes and its 
associated refinement calculus in Circus [15], a unified language for describing 
state-based reactive systems. The semantics [16] of Circus is based on UTP and a 
development method for Circus based on refinement [11, 2, 3] has been proposed. 
This inclusion will enhance the Circus model and allow Circus specifications to 
be used to reason about mobility. 